Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add LastBy / FirstBy features to NaturalJoin #6604

Merged
merged 19 commits into from
Feb 5, 2025

Conversation

lbooker42
Copy link
Contributor

@lbooker42 lbooker42 commented Jan 29, 2025

Allows user to specify the row to include when there are duplicates of RHS rows.

Choices are:

  • ERROR_ON_DUPLICATES - original NJ behavior
  • FIRST_MATCH - equivalent to running FirstBy + NaturalJoin
  • LAST_MATCH - equivalent to running LastBy + NaturalJoin
  • EXACT_MATCH - equivalent of exactJoin, requires exactly one RHS row for every LHS row

@lbooker42 lbooker42 changed the title Nightly/dh 18491 lab optimized nj feat: add LastBy / FirstBy features to NaturalJoin Jan 29, 2025
@lbooker42 lbooker42 added this to the 0.38.0 milestone Jan 31, 2025
Copy link
Contributor Author

@lbooker42 lbooker42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self review

@lbooker42 lbooker42 marked this pull request as ready for review February 3, 2025 20:08
py/client/pydeephaven/table.py Show resolved Hide resolved
py/server/tests/test_table.py Outdated Show resolved Hide resolved
py/server/tests/test_table.py Outdated Show resolved Hide resolved
Comment on lines +347 to +348
result_table_2 = left_table.natural_join(right_table_raw, on="key", joins="rhs_index=index", type=NaturalJoinType.LAST_MATCH)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to test the other enum values? The same question for the client test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. The options are tested heavily in Java already and we are primarily ensuring GRPC and the python client are plumbed correctly.


Returns:
self
"""
return super().natural_join(table, on, joins)
return super().natural_join(table, on, joins, type)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a simple test in test_query.py. I am not insistent on that.

@lbooker42 lbooker42 requested a review from cpwright February 5, 2025 00:47
@cpwright cpwright enabled auto-merge (squash) February 5, 2025 16:52
@cpwright cpwright disabled auto-merge February 5, 2025 16:52
@lbooker42 lbooker42 enabled auto-merge (squash) February 5, 2025 16:56
@lbooker42 lbooker42 merged commit 46579d3 into deephaven:main Feb 5, 2025
16 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Feb 5, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants