Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Conversation

@Gawaboumga
Copy link

Rationale for this change

Iceberg V3 supports both Geography & Geometry types.

Are these changes tested?

Not enough

Copy link
Contributor

@rambleraptor rambleraptor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting this! I've got a couple of suggestions

else:
raise ValidationError(f"Could not parse {geography} into a GeographyType")
elif isinstance(geography, dict):
return geography["crs"], geography["edge_algorithm"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you throw a user-readable error if either of these do not exist on the dictionary?

Copy link
Author

@Gawaboumga Gawaboumga Jul 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have created a _raise_if_any_missing_dictionary_key function, and applied it also to DecimalType & FixedType

else:
raise ValidationError(f"Could not parse {geometry} into a GeometryType")
elif isinstance(geometry, dict):
return geometry["crs"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you throw a user-readable error if either of these do not exist on the dictionary?



def test_repr_geometry() -> None:
assert repr(GeometryType()) == "GeometryType(crs=None)"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the spec, crs can not be None.

Image

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Java, they took the position to store nulls, here it was not clear wether __str__ should behaved as ser_model. Nonetheless, it has been changed for both.

@petern48
Copy link
Contributor

petern48 commented Sep 9, 2025

@Gawaboumga were you still planning to work on this? If not, I'd be happy to take it over to get this over the finish line sooner. I've worked a lot with geospatial data, so I'm fairly familiar already.

@Gawaboumga
Copy link
Author

Do you know what's currently missing in this implementation ?
I also wanted to use geospatial data but as this project is not mature, I had to find an alternative ...

@petern48
Copy link
Contributor

I honestly haven't dug deep into this myself yet, so I'm not sure what else or how much is left for this implementation. But getting the tests to pass would be a good next step. From there, we should ensure that reading and writing geo data works properly when using the public api.

pyproject.toml Outdated
tenacity = ">=8.2.3,<10.0.0"
pyroaring = ">=1.0.0,<2.0.0"
pyarrow = { version = ">=21.0.0", optional = true }
pyarrow = { version = ">=17.0.0", optional = true }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, pyarrow version got bumped back down when you merged main in. We need 21.0.0 for geo support

@petern48
Copy link
Contributor

@Gawaboumga, It looks like workflows won't run automatically since it's your first PR in this repo, so it would probably be easiest for you to run CI in your own fork to debug (otherwise, a maintainer would need to manually approve CI to run after each commit).

Were you planning to continue this soon? If not, may I take over? I'm not yet familiar enough with this repo to know what's missing, but I think I can figure it all out if I start from scratch and build out this feature myself. I've got a bit of extra time on my hands nowadays, so now would be a good time.

@jiayuasu
Copy link
Member

Note that this PR should be aware of this PR: apache/iceberg#12667

The Iceberg Geo implementation is somehow blocked by if we want to drop the longitude wraparound of geometries.

@Gawaboumga Gawaboumga closed this Oct 5, 2025
@Gawaboumga Gawaboumga force-pushed the Add-GeographyType-/-GeometryType branch from ed28daa to 34265fb Compare October 5, 2025 14:13
@Gawaboumga Gawaboumga reopened this Oct 5, 2025
@Gawaboumga
Copy link
Author

I recreated the branch to make it cleaner.
apache/iceberg#12667 would make more sense to be implemented in its own PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

Morty Proxy This is a proxified and sanitized view of the page, visit original site.