-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support arbitrary number of WHEN THEN clauses in the scalar CASE function #14125
Support arbitrary number of WHEN THEN clauses in the scalar CASE function #14125
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #14125 +/- ##
============================================
+ Coverage 61.75% 63.80% +2.05%
- Complexity 207 1532 +1325
============================================
Files 2436 2621 +185
Lines 133233 144028 +10795
Branches 20636 22046 +1410
============================================
+ Hits 82274 91902 +9628
- Misses 44911 45327 +416
- Partials 6048 6799 +751
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
3791518
to
d65c1de
Compare
|
||
|
||
@ScalarFunction(nullableParameters = true, names = {"case", "caseWhen"}) | ||
public class CaseWhenScalarFunction implements PinotScalarFunction { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to have a class for it? Seems it doesn't involve polymorphism, so isVarArg
should be able to handle it. See ArrayFunctions.arrayValueConstructor
as an example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do need to validate that there are 2 or more arguments to the function and from what I can tell, we currently can't do that with the scalar function approach. It shouldn't be too difficult to add another parameter to the ScalarFunction
annotation class for such a validation, but it seems cleaner to use the PinotScalarFunction
class approach instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the registered SqlStdOperatorTable.CASE
perform this type matching? Ideally the type matching should happen in the SqlOperator
instead of here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The standard SqlCaseOperator
does have type matching logic, but actually I just realized that we don't need any validation for checking whether there are 2 or more arguments because the parser itself takes care of that in both the v1 and v2 engines. I've removed this class and moved it back as a method in ObjectFunctions
.
return FUNCTION_INFO; | ||
} | ||
|
||
public static Object caseWhen(Object... objs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to align the behavior with transform function, do we need to calculate the result type and do a cast in the end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The transform function needs to do that because the interface requires it to define the result type; it is also able to do that fairly easily because all its input operands are also transform functions. Here, where all the arguments are simply objects, I'm guessing we'll need to use PinotDataType
and this util method to determine the Pinot type for each argument and then try to find a common type (which will only work for numeric types)? Do we want to fail in case there are heterogeneous types (apart from the string representation of numeric types mixed with numeric types)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In v2, will the SqlOperator
derived the correct return type, and then cast the return value to the desired type? If so, then we get consistent behavior in v2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the SqlCaseOperator
has similar logic to determine the "least restrictive" type across all the then
clauses and the else
clause and that is determined as the return type for the operator - https://github.com/apache/calcite/blob/78e873d39c0364f9f36055b9cbe0600dfad49c71/core/src/main/java/org/apache/calcite/sql/fun/SqlCaseOperator.java#L219-L323. We then cast the actual returned value from the scalar function to that computed type (or the Pinot equivalent type for the Calcite RelDataType
) in the v2 engine. So it's only the v1 engine where the scalar function return type won't be determined based on the input operand types.
d65c1de
to
3ea8e0d
Compare
ColumnDataType resultType = FunctionUtils.getColumnDataType(_functionInvoker.getResultClass()); | ||
// Handle unrecognized result class with STRING | ||
_resultType = resultType != null ? resultType : ColumnDataType.STRING; | ||
|
||
if (!_functionInvoker.getMethod().isVarArgs()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand this check, but is there a reason why we need to move the code after resultType
is declared? As far as I can see the rest of the code is the same, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the diff is slightly confusing, all I did was to separate the common parts and the part which should only be executed when the scalar function is not varargs.
pinot/pinot-core/src/main/java/org/apache/pinot/core/operator/transform/function/CaseTransformFunction.java
Lines 77 to 99 in 19b79f4
IllegalArgumentException: Unsupported function: CASE with argument types: ...
/IllegalArgumentException: Unsupported function: CASE with ... arguments
whereas some other queries with > 15 when then clauses in a case function will succeed.PostAggregationFunction
class to be able to use variadic argument scalar functions similar to the changes done in other call-sites of scalar function invokers.