Add table upsert support #1660

Merged
merged 59 commits into from
Feb 13, 2025

@kevinjqliu kevinjqliu commented Feb 13, 2025

Closes #402
This PR adds the upsert function to the Table class and supports the following upsert operations:

  • when matched update all
  • when not matched insert all
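
To illustrate the two operations listed above, here is a minimal sketch of upsert semantics on plain Python dictionaries. It is not the code from this PR and does not use the PyIceberg API: the function name `upsert_rows`, the `join_cols` parameter, and the returned counters are hypothetical names chosen for illustration, and the sketch assumes that every matched row counts as an update.

```python
from typing import Any

Row = dict[str, Any]


def upsert_rows(
    target: list[Row], source: list[Row], join_cols: list[str]
) -> tuple[list[Row], int, int]:
    """Illustrative upsert: returns (merged_rows, rows_updated, rows_inserted).

    Hypothetical helper for explanation only; not PyIceberg's implementation.
    """

    def key(row: Row) -> tuple:
        # The join columns identify a row, like the ON clause of a MERGE.
        return tuple(row[col] for col in join_cols)

    merged = [dict(row) for row in target]  # copy so the input is not mutated
    index = {key(row): pos for pos, row in enumerate(merged)}
    updated = inserted = 0

    for row in source:
        k = key(row)
        if k in index:
            # when matched: update all columns of the existing row
            merged[index[k]] = dict(row)
            updated += 1
        else:
            # when not matched: insert the whole row
            index[k] = len(merged)
            merged.append(dict(row))
            inserted += 1

    return merged, updated, inserted


existing = [{"id": 1, "city": "Amsterdam"}, {"id": 2, "city": "Drachten"}]
incoming = [{"id": 2, "city": "Paris"}, {"id": 3, "city": "Berlin"}]
rows, n_updated, n_inserted = upsert_rows(existing, incoming, join_cols=["id"])
# rows now holds ids 1, 2, 3; id 2 has city "Paris"
# n_updated == 1 and n_inserted == 1
```

In this sketch, rows are matched purely on the join columns, so a source row either replaces its counterpart in full or is appended as a new row.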

This PR is a remake of #1534 due to some infrastructure issues. For additional context, please refer to that PR.

mattmartin14 and others added 30 commits January 14, 2025 11:15
…inal tweaks to test code soon to paramaterize for pytests
@kevinjqliu kevinjqliu requested a review from Fokko February 13, 2025 18:46
@kevinjqliu
Contributor Author

kevinjqliu commented Feb 13, 2025

cc reviewers from the other PR (@Fokko / @corleyma / @tscottcoombes1 / @marcoaanogueira) and the original PR author @mattmartin14

@mattmartin14
Contributor

> cc reviewers from the other PR (@Fokko / @corleyma / @tscottcoombes1 / @marcoaanogueira) and the original PR author @mattmartin14

Looks great to me. Thanks for getting this over the goal line. I'm excited to get this into others' hands.

@bitsondatadev
Contributor

Hey @kevinjqliu, just a note: after all the changes are done, it may be best to manually squash the commits and make sure Matt is the author. I'm not sure what GitHub will do with Matt's treasure trove of commits moved across PRs and your recent ones.

Contributor

@Fokko Fokko left a comment

This looks great! Thanks again @mattmartin14

Regarding @bitsondatadev's comment: when you do a squash and merge, you'll see multiple authors; an example can be found here. When you run git blame, it will point to Matt himself (except for the lines that @kevinjqliu touched) :)

@mattmartin14
Contributor

> This looks great! Thanks again @mattmartin14
>
> Regarding @bitsondatadev's comment: when you do a squash and merge, you'll see multiple authors; an example can be found here. When you run git blame, it will point to Matt himself (except for the lines that @kevinjqliu touched) :)

Good enough for me 😂. I'm famous!!!

@mattmartin14
Contributor

@Fokko @kevinjqliu - should I go ahead and close the old PR now?

@Fokko Fokko merged commit 6351066 into apache:main Feb 13, 2025
7 checks passed
@Fokko
Contributor

Fokko commented Feb 13, 2025

@mattmartin14 Yes, please go ahead. Thanks everyone for driving this, @mattmartin14 in particular!

@bitsondatadev
Contributor

Great work @mattmartin14 👏🏻 👏🏻

If I'm not mistaken, this is your first PR merged in any open source project, correct?

Not a bad first feature! Mine was adding array types for the Elasticsearch connector in Trino...you know this because in the docs example, the timestamp_field is my birthday, the array_int_field is the number to call Jenny, and the int_field is my lucky number.

You also know this was acquired later by Presto since they copied it to their docs 😈.

@mattmartin14
Contributor

> Great work @mattmartin14 👏🏻 👏🏻
>
> If I'm not mistaken, this is your first PR merged in any open source project, correct?
>
> Not a bad first feature! Mine was adding array types for the Elasticsearch connector in Trino...you know this because in the docs example, the timestamp_field is my birthday, the array_int_field is the number to call Jenny, and the int_field is my lucky number.
>
> You also know this was acquired later by Presto since they copied it to their docs 😈.

Correct, this was my first PR to open source; I'm 1/1 😁. And I learned a lot! I was unaware that you added the array type to Trino. That is some cool stuff.

Thanks again all,
Matt

@kevinjqliu kevinjqliu deleted the StateFarmIns/main branch February 14, 2025 00:00
@kevinjqliu
Contributor Author

Thanks everyone for getting this over the finish line! Upsert has been a long-awaited feature. I'm excited to include this as part of the upcoming 0.9.0 release.

This is truly a team effort and really demonstrates the power of open source and working in the open. Cheers!

@kevinjqliu kevinjqliu added this to the PyIceberg 0.9.0 release milestone Feb 14, 2025
Development

Successfully merging this pull request may close these issues.

Merge into / Upsert
4 participants