Closed
Labels: good first issue (Good for newcomers)
Description
Feature Request / Improvement
I noticed that in Python, the Hive, Glue, and DynamoDB catalogs list all tables in the namespace, including non-Iceberg ones:
iceberg-python/pyiceberg/catalog/hive.py
Lines 488 to 504 in acc934f
    def list_tables(self, namespace: Union[str, Identifier]) -> List[Identifier]:
        """List tables under the given namespace in the catalog (including non-Iceberg tables).

        When the database doesn't exist, it will just return an empty list.

        Args:
            namespace: Database to list.

        Returns:
            List[Identifier]: list of table identifiers.

        Raises:
            NoSuchNamespaceError: If a namespace with the given name does not exist, or the identifier is invalid.
        """
        database_name = self.identifier_to_database(namespace, NoSuchNamespaceError)
        with self._client as open_client:
            return [(database_name, table_name) for table_name in open_client.get_all_tables(db_name=database_name)]
iceberg-python/pyiceberg/catalog/glue.py
Lines 584 to 613 in acc934f
    def list_tables(self, namespace: Union[str, Identifier]) -> List[Identifier]:
        """List tables under the given namespace in the catalog (including non-Iceberg tables).

        Args:
            namespace (str | Identifier): Namespace identifier to search.

        Returns:
            List[Identifier]: list of table identifiers.

        Raises:
            NoSuchNamespaceError: If a namespace with the given name does not exist, or the identifier is invalid.
        """
        database_name = self.identifier_to_database(namespace, NoSuchNamespaceError)
        table_list: List[TableTypeDef] = []
        next_token: Optional[str] = None

        try:
            while True:
                table_list_response = (
                    self.glue.get_tables(DatabaseName=database_name)
                    if not next_token
                    else self.glue.get_tables(DatabaseName=database_name, NextToken=next_token)
                )
                table_list.extend(table_list_response["TableList"])
                next_token = table_list_response.get("NextToken")
                if not next_token:
                    break
        except self.glue.exceptions.EntityNotFoundException as e:
            raise NoSuchNamespaceError(f"Database does not exist: {database_name}") from e
        return [(database_name, table["Name"]) for table in table_list]
However, in Java, we apply a filter so that only Iceberg tables in the given namespace are returned:
GlueCatalog.listTables
HiveCatalog.listTables
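For illustration, a Python equivalent of that filter might look like the sketch below. It assumes Iceberg tables are marked by a `table_type` parameter whose value is `ICEBERG` (compared case-insensitively), and that table entries are shaped like the Glue `TableList` dicts in the snippet above. The names `is_iceberg_table` and `filter_iceberg_tables` are hypothetical helpers, not existing PyIceberg functions.

```python
from typing import Any, List, Mapping, Tuple

# Assumption: Iceberg tables carry a "table_type" parameter set to "ICEBERG".
TABLE_TYPE_KEY = "table_type"
ICEBERG_TABLE_TYPE = "iceberg"


def is_iceberg_table(table: Mapping[str, Any]) -> bool:
    """Return True if the table's parameters mark it as an Iceberg table."""
    # "Parameters" may be missing or None for non-Iceberg tables.
    parameters = table.get("Parameters") or {}
    return str(parameters.get(TABLE_TYPE_KEY, "")).lower() == ICEBERG_TABLE_TYPE


def filter_iceberg_tables(
    database_name: str, table_list: List[Mapping[str, Any]]
) -> List[Tuple[str, str]]:
    """Build (database, table) identifiers for Iceberg tables only."""
    return [
        (database_name, table["Name"])
        for table in table_list
        if is_iceberg_table(table)
    ]


# Hypothetical sample data shaped like Glue "TableList" entries.
sample_tables = [
    {"Name": "events", "Parameters": {"table_type": "ICEBERG"}},
    {"Name": "legacy_logs", "Parameters": {"classification": "parquet"}},
    {"Name": "no_params"},
]

print(filter_iceberg_tables("analytics", sample_tables))
# prints [('analytics', 'events')]
```

In the Glue `list_tables` above, this would amount to adding the `is_iceberg_table` condition to the final list comprehension. The Hive catalog would need table objects rather than just names before the same check could be applied.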
I don't remember whether we discussed this before: why did we choose to include non-Iceberg tables in the result in Python?
cc @Fokko