Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Conversation

@MehulBatra
Copy link
Contributor

@MehulBatra MehulBatra commented Apr 30, 2024

  • Implementation of table_exists method at Initialization level already exists bypassing it.
  • Unit Test for glue, sqlcatalog and dynamodb.
  • Integration test for glue and dynamodb.

Copy link
Contributor

@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! thanks for adding tests for different catalogs. I left a few minor comments

Returns:
bool: True if the table exists, False otherwise.
"""
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we get rid of the other table_exists function in the same file?

def table_exists(self, identifier: Union[str, Identifier]) -> bool:
try:
self.load_table(identifier)
return True
except NoSuchTableError:
return False

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think about changing the REST catalog's table_exists implementation to be similar to this?
https://github.com/apache/iceberg-python/pull/512/files#diff-3bda7391ebd8aa3dcfd6703d8d2764830b9d9c35fa854188a37d69611274bd3dR722-R727

maybe calls super().table_exists()?

Copy link
Contributor

@HonahX HonahX May 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better for RestCatalog to maintain a separate implementation to make a head request to /v1/{prefix}/namespaces/{namespace}/tables/{table}, ref: #512 (comment), #507 (comment)

The try-catch implementation is for other non-rest catalogs and thus it is put in the MetastoreCatalog instead of the Catalog interface.

A little bit more context regarding the MetastoreCatalog: while working on #498, we found that many helper functions as well as some implementations are for non-rest catalogs only. So we decide to move those to another layer, MetastoreCatalog to make the inheritance look better: #498 (comment).

Do these sound reasonable to you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, thanks for the detailed explanation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushed the changes, as per the discussion, thank you, guys, for the feedback!

Copy link
Contributor

@HonahX HonahX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! I think we've had the implementation of table_exists.

For non-rest catalog it is

def table_exists(self, identifier: Union[str, Identifier]) -> bool:
try:
self.load_table(identifier)
return True
except NoSuchTableError:
return False

as pointed out by @kevinjqliu

For rest catalog it is

def table_exists(self, identifier: Union[str, Identifier]) -> bool:
"""Check if a table exists.
Args:
identifier (str | Identifier): Table identifier.
Returns:
bool: True if the table exists, False otherwise.
"""
identifier_tuple = self.identifier_to_tuple_without_catalog(identifier)
response = self._session.head(
self.url(Endpoints.load_table, prefixed=True, **self._split_identifier_for_path(identifier_tuple))
)
return response.status_code == 200

But we do miss relevant tests for glue, sql, dynamodb, ...

pyiceberg/catalog/__init__.py Outdated Show resolved Hide resolved
tests/catalog/integration_test_dynamodb.py Outdated Show resolved Hide resolved
Co-authored-by: Honah J. <undefined.newdb.newtable@gmail.com>
Copy link
Contributor

@HonahX HonahX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM! The lint issue can be resolved by executing make lint

@MehulBatra
Copy link
Contributor Author

Overall LGTM! The lint issue can be resolved by executing make lint

Thank you @HonahX and @kevinjqliu for all the feedback!

Copy link
Contributor

@HonahX HonahX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work! This will be good to go when the CI pass.

@kevinjqliu
Copy link
Contributor

Thank you for the PR! Do you mind changing the PR title and description to match?

@MehulBatra MehulBatra changed the title Implement table_exists method with try-catch for NoSuchTableError table_exists unit/integration test for NoSuchTableError May 3, 2024
@HonahX HonahX merged commit 7bd5d9e into apache:main May 4, 2024
@HonahX
Copy link
Contributor

HonahX commented May 4, 2024

Thanks everyone! Merged!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Morty Proxy This is a proxified and sanitized view of the page, visit original site.