Add support for v3 format in Iceberg #24455

Closed
wants to merge 2 commits into from

@ebyhr ebyhr commented Dec 12, 2024

Description

This is preparatory work for supporting new v3 features (e.g. variant type, timestamp nanos, deletion vectors, etc.).
I didn't change the default version because the default value in Iceberg is still v2.
https://github.com/apache/iceberg/blob/fe2f593cd025223e4ab5ab41a296fb106ce3b1cf/core/src/main/java/org/apache/iceberg/TableMetadata.java#L54

Release notes

## Iceberg
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Dec 12, 2024
@github-actions github-actions bot added docs iceberg Iceberg connector labels Dec 12, 2024
@ebyhr ebyhr force-pushed the ebi/iceberg-v3-max branch 2 times, most recently from 62274e9 to f48975a Compare December 12, 2024 05:51
raunaqmorarka previously approved these changes Dec 12, 2024

@raunaqmorarka raunaqmorarka left a comment

I'm assuming there are no prerequisites to declaring v3 support in Iceberg tables.

ebyhr commented Dec 12, 2024

I rechecked the v3 spec. It looks like we should disallow the following write operation, so I will change the implementation.

https://iceberg.apache.org/spec/#version-3

> Writers are not allowed to add new position delete files to v3 tables

ebyhr commented Dec 12, 2024

Added logic to the `IcebergMetadata.beginMerge` method to disallow modifying rows in v3 tables.
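
For illustration, a guard of this kind might look roughly like the sketch below. This is not the code from the PR: the class name, method name, constant, and error message are all hypothetical, and the sketch assumes that rejecting every row-level change for format version 3 and above is enough to avoid writing new position delete files.

```java
// Hypothetical sketch only; none of these names come from the Trino code base.
final class FormatVersionGuard
{
    // Assumption: row-level changes are rejected from this format version onward,
    // because v3 writers are not allowed to add new position delete files.
    static final int FIRST_VERSION_WITHOUT_ROW_LEVEL_WRITES = 3;

    private FormatVersionGuard() {}

    // Throws if the table's format version does not allow row-level modifications.
    static void checkRowLevelWriteSupported(int formatVersion)
    {
        if (formatVersion >= FIRST_VERSION_WITHOUT_ROW_LEVEL_WRITES) {
            throw new UnsupportedOperationException(
                    "Modifying rows is not supported for Iceberg format version " + formatVersion);
        }
    }
}
```

A merge entry point would call `FormatVersionGuard.checkRowLevelWriteSupported(formatVersion)` before doing any work, so UPDATE, DELETE, and MERGE fail early instead of producing files a v3 reader should not see.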

@ebyhr ebyhr force-pushed the ebi/iceberg-v3-max branch 2 times, most recently from f15c9d7 to 2b1f3ba Compare December 12, 2024 11:28
@ebyhr ebyhr requested a review from raunaqmorarka December 12, 2024 11:47
@ebyhr ebyhr force-pushed the ebi/iceberg-v3-max branch from 2b1f3ba to a798b1d Compare December 24, 2024 08:51

ebyhr commented Dec 24, 2024

@raunaqmorarka Please take another look.

@ebyhr ebyhr force-pushed the ebi/iceberg-v3-max branch from a798b1d to 2ced776 Compare January 9, 2025 01:40
@hashhar hashhar removed their request for review January 10, 2025 13:26
@@ -865,8 +865,8 @@ connector using a {doc}`WITH </sql/create-table-as>` clause.
   - Optionally specifies the file system location URI for the table.
 * - `format_version`
   - Optionally specifies the format version of the Iceberg specification to use
-    for new tables; either `1` or `2`. Defaults to `2`. Version `2` is required
-    for row level deletes.
+    for new tables; either `1`, `2` or `3`. Defaults to `2`. Only version `2`

I don't understand what the user-facing impact of this change is.
I see that we disallow altering a table to v3, as well as update/merge.
Are reads, optimize and inserts to existing v3 tables allowed? Is it valid to allow them while we don't have support for v3 deletion vectors? Do we error out only if a v3 deletion vector is encountered?
I think it would be nicer if we just added support for deletion vectors along with v3 support; otherwise I'm not sure what the benefit of supporting v3 without them is.

@ebyhr ebyhr Jan 13, 2025

I will rework this once they publish an API for deletion vectors.

@ebyhr ebyhr closed this Jan 14, 2025
@ebyhr ebyhr deleted the ebi/iceberg-v3-max branch January 14, 2025 22:07