Add Exchange before GroupId to improve Partial Aggregation #24047
Thanks for the release note entry! Minor formatting nits, and include the PR number.
High-level first pass. Seems good for the most part. I will take another pass and look at the details of the rule tomorrow.
presto-main/src/main/java/com/facebook/presto/SystemSessionProperties.java
.filter(entry -> entry.getCount() >= groupId.getGroupingSets().size() * GROUPING_SETS_SYMBOL_REQUIRED_FREQUENCY)
.map(Multiset.Entry::getElement)
// And only the symbols used in the aggregation (these are usually all symbols)
.peek(symbol -> verify(groupingKeys.contains(symbol)))
Do we really want peek+verify here? I'm concerned about the case where verify fails. Will an exception fail the query?
Yes, the query should fail. I will add a verification message to make this clearer.
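For readers following this thread, here is a minimal standalone sketch of the frequency filter and the verification step discussed above. It is not the code from this PR: it uses plain `java.util` collections instead of Guava's `Multiset`, the class and method names are hypothetical, and it assumes a symbol appears at most once per grouping set. The explicit check with a message stands in for `peek` + `verify`, and the exception is what would fail the query.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; names are not taken from the Presto code base.
public class FrequentGroupingSymbols
{
    // Assumed to play the role of GROUPING_SETS_SYMBOL_REQUIRED_FREQUENCY in the diff.
    private static final double REQUIRED_FREQUENCY = 0.5;

    // Returns the symbols that occur in at least REQUIRED_FREQUENCY of the grouping sets.
    // Throws if such a symbol is not one of the aggregation's grouping keys.
    public static List<String> frequentSymbols(List<List<String>> groupingSets, Set<String> groupingKeys)
    {
        // Count how many grouping sets each symbol appears in.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (List<String> groupingSet : groupingSets) {
            for (String symbol : groupingSet) {
                counts.merge(symbol, 1, Integer::sum);
            }
        }

        double threshold = groupingSets.size() * REQUIRED_FREQUENCY;
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() < threshold) {
                continue;
            }
            // Explicit check with a message, in place of peek + verify.
            if (!groupingKeys.contains(entry.getKey())) {
                throw new IllegalStateException(
                        "Grouping set symbol is not a grouping key of the aggregation: " + entry.getKey());
            }
            result.add(entry.getKey());
        }
        return result;
    }

    public static void main(String[] args)
    {
        List<List<String>> groupingSets = List.of(
                List.of("a", "b", "c"),
                List.of("a", "b"),
                List.of("a"),
                List.of());
        System.out.println(frequentSymbols(groupingSets, Set.of("a", "b", "c")));
    }
}
```

Doing the check in an ordinary loop rather than in `peek` keeps the failure independent of how the stream happens to be evaluated, and the message names the offending symbol.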
private static final double GROUPING_SETS_SYMBOL_REQUIRED_FREQUENCY = 0.5;
private static final double ANTI_SKEWNESS_MARGIN = 3;
Was there any experimentation with these parameters?
No, I just left them as-is while porting the change over from trinodb/trino#105. Tests for this weren't added there either, and I could not come up with a good integration test to experiment with these.
As users test this feature out (it is disabled by default), we can tweak and test.
...ook/presto/sql/planner/TestLogicalAddExchangesBelowPartialAggregationOverGroupIdRuleSet.java
...presto/sql/planner/iterative/rule/AddExchangesBelowPartialAggregationOverGroupIdRuleSet.java
    return false;
}

return isEnabledAddExchangeBelowGroupId(session);
Maybe this should have three possible values: ALWAYS, COST_BASED, and NEVER (similar to partial aggregation pushdown). That way someone can enable this if they don't have stats or if the stats estimates are no good.
What would we re-partition on if ALWAYS is chosen (for the non-trivial case of more than one partition variable)?
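To make the suggestion concrete, a rough sketch of what a three-valued setting could look like follows. Everything here is hypothetical: the enum, the class name, and the two boolean inputs are illustrative stand-ins and do not exist in this PR, which currently gates the rule on a boolean session property.

```java
// Hypothetical illustration of the ALWAYS / COST_BASED / NEVER proposal.
public class AddExchangeBelowGroupIdStrategyExample
{
    // Assumed names, modeled on the values proposed in the review comment.
    public enum Strategy
    {
        ALWAYS,
        COST_BASED,
        NEVER
    }

    // Decides whether the rule should fire.
    // statsAvailable and costModelFavorsExchange are placeholders for whatever
    // the real rule would derive from table statistics.
    public static boolean shouldApply(Strategy strategy, boolean statsAvailable, boolean costModelFavorsExchange)
    {
        switch (strategy) {
            case ALWAYS:
                return true;
            case NEVER:
                return false;
            case COST_BASED:
                return statsAvailable && costModelFavorsExchange;
            default:
                throw new IllegalArgumentException("Unknown strategy: " + strategy);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(shouldApply(Strategy.COST_BASED, false, true));
    }
}
```

This only covers whether the rule fires. It leaves open the question raised in the reply: which symbols to partition on under ALWAYS when no statistics are available to choose among several candidates.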
...to/sql/planner/iterative/rule/TestAddExchangesBelowPartialAggregationOverGroupIdRuleSet.java
}

@Test
public void testAddExchangesWithoutProjection()
What about a withProjection test? Also tests that it doesn't fire if it's disabled, only has one grouping set, or has pass-through keys.
'only has one grouping set' -> added with this test.
withProjection, does not fire if disabled -> will add.
only has one grouping set, has pass-through keys -> I could not build a use case where this occurs, and it is unclear to me when it could. Can you help me out with an example?
Based on: trinodb/trino@dc1d66fb
Co-authored-by: Piotr Findeisen
Based on: trinodb/trino@c573b34
Co-authored-by: Lukasz Stec
Based on: trinodb/trino@29328d3
Co-authored-by: praveenkrishna
The minor formatting nits should still apply, but new release note guidelines as of last week: PR #24354 automatically adds links to this PR to the release notes. Please remove the manual PR link in the following format from the release note entries for this PR.
I have updated the Release Notes Guidelines to remove the examples of manually adding the PR link.
See #23475 for more details
Previously closed PR - #11741
Description
Motivation and Context
See the Javadoc of the new AddExchangesBelowPartialAggregationOverGroupIdRuleSet.
Impact
Better performance for TPCDS Q22, Q67
See plan diffs (TPCDS SF 1000, unpartitioned) - https://aaneja.github.io/mypages/PR_24047_AddExchangesBelowPartialAggregationOverGroupId_OffVsOn.html
Test Plan
TODO: Add a new planner test
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.