-
Notifications
You must be signed in to change notification settings - Fork 412
Allow non-string typed values in table properties #469
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow non-string typed values in table properties #469
Conversation
tests/integration/test_writes.py
Outdated
| _ = _create_table( | ||
| session_catalog, identifier, {"format-version": format_version, **property_with_none}, [arrow_table_with_null] | ||
| ) | ||
| assert "NullPointerException: null value in entry: property_name=null" in str(exc_info.value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Fokko this throws a server-side error, not sure if this should be corrected on the server implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we accept null values. Probably we want to catch this in PyIceberg before doing the request. Other backends might have different behavior so we want to make sure that we follow the correct behavior in the client itself.
Fokko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good @kevinjqliu, thanks for working on this 👍
pyiceberg/table/metadata.py
Outdated
| @field_validator("properties", mode='before') | ||
| @classmethod | ||
| def transform_dict_value_to_str(cls, dict: Dict[str, Any]) -> Dict[str, str]: | ||
| assert None not in dict.values(), "None type is not a supported value in properties" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We try to avoid asserts outside of tests/. Could you raise a ValueError instead?
| database_name, _table_name = random_identifier | ||
| catalog.create_namespace(database_name) | ||
| property_with_none = {"property_name": None} | ||
| with pytest.raises(ValidationError) as exc_info: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect an exception being thrown by the field validator, instead of Pydantic itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe Pydantic catches the underlying error and reraise a ValidationError
from the docs,
"""
Pydantic will raise a ValidationError whenever it finds an error in the data it's validating.
Note
Validation code should not raise ValidationError itself, but rather raise a ValueError or AssertionError (or subclass thereof) which will be caught and used to populate ValidationError.
"""
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I wasn't aware of that. Looks like ValidationError extends ValueError, so we're good here. Thanks!
tests/integration/test_writes.py
Outdated
| def test_table_properties_raise_for_none_value( | ||
| session_catalog: Catalog, | ||
| arrow_table_with_null: pa.Table, | ||
| format_version: str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| format_version: str, | |
| format_version: int, |
tests/integration/test_writes.py
Outdated
| _ = _create_table( | ||
| session_catalog, identifier, {"format-version": format_version, **property_with_none}, [arrow_table_with_null] | ||
| ) | ||
| assert "NullPointerException: null value in entry: property_name=null" in str(exc_info.value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we accept null values. Probably we want to catch this in PyIceberg before doing the request. Other backends might have different behavior so we want to make sure that we follow the correct behavior in the client itself.
6cc20b9 to
b4ea1b7
Compare
|
thanks for the review @Fokko, I've address the comments above, ptal |
pyiceberg/types.py
Outdated
| for value in dict.values(): | ||
| if value is None: | ||
| raise ValueError("None type is not a supported value in properties") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would help the user to show which key is None:
| for value in dict.values(): | |
| if value is None: | |
| raise ValueError("None type is not a supported value in properties") | |
| for key, value in dict: | |
| if value is None: | |
| raise ValueError(f"None type is not a supported value in property: {key}") |
|
|
||
| @pytest.mark.integration | ||
| @pytest.mark.parametrize("format_version", ["1", "2"]) | ||
| @pytest.mark.parametrize("format_version", [1, 2]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice :)
Fokko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! One minor comment 👍 Thanks for working on this @kevinjqliu
b4ea1b7 to
220054d
Compare
|
@Fokko good idea on adding the key! |
|
Thanks @kevinjqliu for fixing this 👍 |
* property accept int https://stackoverflow.com/questions/77304167/using-pydantic-to-change-int-to-string https://docs.pydantic.dev/latest/concepts/validators/\#field-validators * add tests * add integration tests * pr feedback * make validator reusable * show key when none
* property accept int https://stackoverflow.com/questions/77304167/using-pydantic-to-change-int-to-string https://docs.pydantic.dev/latest/concepts/validators/\#field-validators * add tests * add integration tests * pr feedback * make validator reusable * show key when none
Resolves #376
We want to be able to accept types other than string type in the
propertiesfield of Table and TableMetadata.For example, setting a value to an
inttypeThis PR adds a "before" field validator to
TableMetadataCommonFieldswhich will transform the values of thepropertiesdict to str. Note, we explicitly disallowNonetype, since the transformation will changeNoneto str"None"which is unintuitive.References