Conversation

@Fokko
Contributor

@Fokko Fokko commented Oct 28, 2024

Fixes #1265

@bigluck
Contributor

bigluck commented Oct 28, 2024

@Fokko following up on our discussion on Slack: https://apache-iceberg.slack.com/archives/C029EE6HQ5D/p1730134036731089?thread_ts=1730122956.980119&cid=C029EE6HQ5D

PyArrow 17 installed numpy as a dependency, but starting with PyArrow 18 that dependency was removed.

apache/arrow#44148

The io/pyarrow.py file imports numpy, so the import of the PyArrow IO strategy can fail when numpy is missing. It then falls back to the s3fs strategy, hoping the user has that package installed on their system.
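To illustrate the failure mode described above, here is a minimal sketch of a "try each backend in order" loader. It is not PyIceberg's actual code: the function name `load_first_available` and the module name `hypothetical_missing_backend` are made up for the example, and `json` only stands in for a fallback backend that happens to be installed.

```python
import importlib
from types import ModuleType
from typing import List, Optional, Sequence, Tuple


def load_first_available(
    candidates: Sequence[str],
) -> Tuple[Optional[ModuleType], List[Tuple[str, str]]]:
    """Import the first candidate module that loads cleanly.

    Returns the module (or None) plus a list of (module path, error message)
    for every candidate that was skipped.
    """
    skipped: List[Tuple[str, str]] = []
    for path in candidates:
        try:
            return importlib.import_module(path), skipped
        except ImportError as exc:
            # A missing transitive dependency (for example numpy inside a
            # backend module) surfaces here as ModuleNotFoundError, which is
            # a subclass of ImportError, so the whole backend is skipped.
            skipped.append((path, str(exc)))
    return None, skipped


# The first candidate does not exist, so the loader silently moves on.
module, skipped = load_first_available(["hypothetical_missing_backend", "json"])
print(module.__name__ if module else None)
print(skipped)
```

Recording the skipped candidates and their error messages makes it visible that the fallback happened because of a missing dependency rather than a deliberate choice.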

@Fokko
Contributor Author

Fokko commented Oct 28, 2024

@bigluck Thanks, that's a great catch. We only use numpy to combine the positional deletes (when there are multiple positional deletes per file). It would be great to see if we can remove this and make the numpy dependency optional. It is quite a big dependency, and it would be nice to get rid of it.
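As a rough idea of what combining positional deletes without numpy could look like, here is a sketch in plain Python. It assumes each positional delete file can be read as a collection of integer row positions for one data file. The function names `combine_positional_deletes` and `surviving_rows` are hypothetical and do not come from the PyIceberg code base.

```python
from typing import Iterable, List


def combine_positional_deletes(delete_files: Iterable[Iterable[int]]) -> List[int]:
    """Merge row positions from several delete files into one sorted, unique list."""
    positions = set()
    for file_positions in delete_files:
        positions.update(file_positions)
    return sorted(positions)


def surviving_rows(num_rows: int, deleted: Iterable[int]) -> List[int]:
    """Return the row indices of a data file that are not marked as deleted."""
    deleted_set = set(deleted)
    return [i for i in range(num_rows) if i not in deleted_set]


# Two delete files that overlap on position 2.
combined = combine_positional_deletes([[0, 5, 2], [2, 7]])
print(combined)
print(surviving_rows(8, combined))
```

A set-based merge like this trades speed for fewer dependencies, so whether it is acceptable would depend on how large the delete files get in practice.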

@kevinjqliu
Contributor

kevinjqliu commented Oct 28, 2024

Opened #1259 to continue the numpy deprecation conversation.
Optionally, we can temporarily bring in numpy as a project dependency before exploring its deprecation.

@Fokko
Contributor Author

Fokko commented Oct 29, 2024

Keep in mind that the CI passes here because we have numpy as a PySpark dependency :)

@kevinjqliu kevinjqliu added this to the PyIceberg 0.8.0 release milestone Oct 30, 2024
@Fokko Fokko merged commit b2da8c7 into apache:main Oct 30, 2024
@Fokko Fokko deleted the fd-bump-pyarrow branch October 30, 2024 20:29
sungwy pushed a commit to sungwy/iceberg-python that referenced this pull request Dec 7, 2024
Development

Successfully merging this pull request may close these issues.

pyarrow 18 regression: ValueError: type(schema)=<class 'pyarrow.lib.Schema'>

5 participants
