Support Defining PartitionSpec and SortOrder without field-ids in create_table #338

@sungwy

Description

Feature Request / Improvement

Currently, the create_table API only supports defining partition fields and sort fields through PartitionSpec and SortOrder, respectively.

PartitionField and SortField both require that the columns they refer to be identified by field-id.

With #305, we now allow users to define a new table schema for the create_table operation without field-ids. This aligns with the usage pattern in Spark Iceberg DDLs, and lets users create an Iceberg table from a PyArrow schema that may not have field-ids.

Similarly, we would like to support defining PartitionSpec and SortOrder without field-ids when calling create_table.

One early idea is to change the create_table API, and perhaps the underlying functions (like assign_fresh_partition_spec_ids and assign_fresh_sort_order_ids), so that users can define partition specs and sort orders on create_table without field-ids (since new field-ids are generated for these tables anyway).
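To make this first idea concrete, here is a minimal sketch of resolving name-based partition fields against a schema whose field-ids were just assigned. It uses only the standard library. Every class and function in it (`Field`, `NamedPartitionField`, `ResolvedPartitionField`, `assign_fresh_schema_ids`, `resolve_partition_fields`) is a hypothetical stand-in and not part of PyIceberg; the transforms are plain strings for brevity. The starting partition field-id of 1000 follows the Iceberg convention.

```python
from dataclasses import dataclass
from itertools import count
from typing import List

# All names below are hypothetical stand-ins for illustration, not PyIceberg APIs.

PARTITION_FIELD_ID_START = 1000  # Iceberg convention for the first partition field-id


@dataclass(frozen=True)
class Field:
    field_id: int
    name: str


@dataclass(frozen=True)
class NamedPartitionField:
    source_name: str  # column name, used instead of a source field-id
    transform: str    # e.g. "day" or "identity"; a string here for simplicity
    name: str         # name of the resulting partition field


@dataclass(frozen=True)
class ResolvedPartitionField:
    source_id: int
    field_id: int
    transform: str
    name: str


def assign_fresh_schema_ids(column_names: List[str]) -> List[Field]:
    """Assign field-ids 1, 2, 3, ... to a flat list of column names."""
    ids = count(1)
    return [Field(next(ids), name) for name in column_names]


def resolve_partition_fields(
    schema: List[Field], named_fields: List[NamedPartitionField]
) -> List[ResolvedPartitionField]:
    """Look up each column name in the fresh schema and attach field-ids."""
    ids_by_name = {f.name: f.field_id for f in schema}
    resolved = []
    for offset, nf in enumerate(named_fields):
        if nf.source_name not in ids_by_name:
            raise ValueError(f"Unknown source column: {nf.source_name}")
        resolved.append(
            ResolvedPartitionField(
                source_id=ids_by_name[nf.source_name],
                field_id=PARTITION_FIELD_ID_START + offset,
                transform=nf.transform,
                name=nf.name,
            )
        )
    return resolved


schema = assign_fresh_schema_ids(["id", "ts", "region"])
spec = resolve_partition_fields(
    schema,
    [
        NamedPartitionField("ts", "day", "ts_day"),
        NamedPartitionField("region", "identity", "region"),
    ],
)
```

The user never has to know which field-id `ts` or `region` will receive; the lookup happens after the fresh ids exist. A sort order could be resolved the same way.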

Another idea is to stage the table creation first (stage-create), so that the schema is generated without committing the table, and then use the generated schema together with Partition Evolution to commit the new table.
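The staged flow could look roughly like the sketch below. This is a toy model under stated assumptions, not the PyIceberg implementation: `StagedTable`, `stage_create_table`, `add_partition_field`, and `commit` are hypothetical names, the schema is a flat name-to-id mapping, and transforms are plain strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical stand-ins for illustration only, not PyIceberg APIs.


@dataclass
class StagedTable:
    """A table whose schema has fresh field-ids but is not yet committed."""

    identifier: str
    schema: Dict[str, int]  # column name -> freshly assigned field-id
    # Each entry is (source_id, partition_field_id, transform, name).
    partition_fields: List[Tuple[int, int, str, str]] = field(default_factory=list)
    committed: bool = False

    def add_partition_field(self, source_name: str, transform: str, name: str) -> "StagedTable":
        """Partition-evolution-style step: add a field by column name."""
        if self.committed:
            raise RuntimeError("table is already committed")
        source_id = self.schema[source_name]  # raises KeyError for an unknown column
        partition_field_id = 1000 + len(self.partition_fields)
        self.partition_fields.append((source_id, partition_field_id, transform, name))
        return self

    def commit(self) -> dict:
        """Finalize the staged table and return its (toy) metadata."""
        self.committed = True
        return {
            "identifier": self.identifier,
            "schema": dict(self.schema),
            "partition_spec": list(self.partition_fields),
        }


def stage_create_table(identifier: str, column_names: List[str]) -> StagedTable:
    """Assign fresh field-ids 1..n without committing anything."""
    fresh_schema = {name: i for i, name in enumerate(column_names, start=1)}
    return StagedTable(identifier, fresh_schema)


staged = stage_create_table("db.events", ["id", "ts"])
metadata = staged.add_partition_field("ts", "day", "ts_day").commit()
```

Compared with the first idea, this keeps create_table itself unchanged and moves the name-to-id resolution into a step that runs once the generated schema is available.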

