Add option to provide partition spec in spark ADD_FILES procedure #12325
Comments
Please see #12319
@RussellSpitzer thanks, I see your change addresses the spec finding in … I just created the PR for adding the spec as an argument to add_files here: #12327. Can you review my PR? I find that this change is still needed to allow passing the spec for the …
I think I want to do something similar here. Instead of passing in the partition spec, can we just search the Iceberg table to see if a valid spec exists that matches the FileTable?
Makes sense to avoid passing the argument if possible. I'll update the PR.
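For readers following the thread, the spec-matching idea suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the code from either PR: the spec history is modeled as a plain dict from spec id to ordered partition column names (a hypothetical stand-in for what Iceberg's `Table.specs()` exposes), only identity-partitioned columns are considered, and `find_matching_spec_id` is a made-up helper name.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical stand-in for a table's spec history:
# spec id -> ordered partition column names. This is NOT the Iceberg API.
SpecHistory = Dict[int, Tuple[str, ...]]


def find_matching_spec_id(
    specs: SpecHistory, file_partition_columns: Sequence[str]
) -> Optional[int]:
    """Return the id of a known spec whose partition columns equal the
    partition columns of the files being added, or None if none matches.

    When several specs match, the one with the highest id is preferred.
    """
    wanted = tuple(file_partition_columns)
    # Walk spec ids from newest to oldest so the latest match wins.
    for spec_id in sorted(specs, reverse=True):
        if specs[spec_id] == wanted:
            return spec_id
    return None


# Example: the table was first partitioned by year, later by (year, month).
specs = {0: ("year",), 1: ("year", "month")}

# Archived files laid out by year only should resolve to the older spec.
archived_layout = ["year"]
matched = find_matching_spec_id(specs, archived_layout)
```

With this approach the caller never passes a spec explicitly; a `None` result would be the point at which to fail with a clear error rather than silently falling back to the current spec.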
Feature Request / Improvement
Currently, the Spark ADD_FILES procedure in Apache Iceberg does not support specifying a partition spec, so it always uses the table's latest spec when adding files, as shown in the implementation.
This can become problematic when the table or partition spec has evolved over time. For instance, in an archival and restore tool where data was archived before the partition spec changed, it would be beneficial to restore archived data using the older partition spec, rather than the current one.
I am working on a PR to address this and would appreciate any specific suggestions or concerns from the community on making this change.
Query engine
Spark
Willingness to contribute