-
Notifications
You must be signed in to change notification settings - Fork 412
#1820 Add Geography/Geometry type #2224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
#1820 Add Geography/Geometry type #2224
Conversation
rambleraptor
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for submitting this! I've got a couple of suggestions
| else: | ||
| raise ValidationError(f"Could not parse {geography} into a GeographyType") | ||
| elif isinstance(geography, dict): | ||
| return geography["crs"], geography["edge_algorithm"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you throw a user-readable error if either of these do not exist on the dictionary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have created a _raise_if_any_missing_dictionary_key function, and applied it also to DecimalType & FixedType
| else: | ||
| raise ValidationError(f"Could not parse {geometry} into a GeometryType") | ||
| elif isinstance(geometry, dict): | ||
| return geometry["crs"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you throw a user-readable error if either of these do not exist on the dictionary?
tests/test_types.py
Outdated
|
|
||
|
|
||
| def test_repr_geometry() -> None: | ||
| assert repr(GeometryType()) == "GeometryType(crs=None)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Java, they took the position to store nulls, here it was not clear wether __str__ should behaved as ser_model. Nonetheless, it has been changed for both.
|
@Gawaboumga were you still planning to work on this? If not, I'd be happy to take it over to get this over the finish line sooner. I've worked a lot with geospatial data, so I'm fairly familiar already. |
|
Do you know what's currently missing in this implementation ? |
|
I honestly haven't dug deep into this myself yet, so I'm not sure what else or how much is left for this implementation. But getting the tests to pass would be a good next step. From there, we should ensure that reading and writing geo data works properly when using the public api. |
pyproject.toml
Outdated
| tenacity = ">=8.2.3,<10.0.0" | ||
| pyroaring = ">=1.0.0,<2.0.0" | ||
| pyarrow = { version = ">=21.0.0", optional = true } | ||
| pyarrow = { version = ">=17.0.0", optional = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, pyarrow version got bumped back down when you merged main in. We need 21.0.0 for geo support
|
@Gawaboumga, It looks like workflows won't run automatically since it's your first PR in this repo, so it would probably be easiest for you to run CI in your own fork to debug (otherwise, a maintainer would need to manually approve CI to run after each commit). Were you planning to continue this soon? If not, may I take over? I'm not yet familiar enough with this repo to know what's missing, but I think I can figure it all out if I start from scratch and build out this feature myself. I've got a bit of extra time on my hands nowadays, so now would be a good time. |
|
Note that this PR should be aware of this PR: apache/iceberg#12667 The Iceberg Geo implementation is somehow blocked by if we want to drop the longitude wraparound of geometries. |
ed28daa to
34265fb
Compare
|
I recreated the branch to make it cleaner. |

Rationale for this change
Iceberg V3 supports both Geography & Geometry types.
Are these changes tested?
Not enough