Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support div_wrapping/rem_wrapping for numeric arithmetic kernels #7159

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

wForget
Copy link
Member

@wForget wForget commented Feb 20, 2025

Which issue does this PR close?

Closes #7158.

Rationale for this change

The div/rem operations may also overflow, so I want to add div_wrapping/rem_wrapping operations to wrapping on overflow. Make it consistent with the logic of add/sub/mul so that we can implement fail_on_overflow feature for div/rem in downstream datafusion.

What changes are included in this PR?

add div_wrapping/rem_wrapping functions

Are there any user-facing changes?

yes, newly added div_wrapping/rem_wrapping functions

@github-actions github-actions bot added the arrow Changes to the arrow crate label Feb 20, 2025
@tustvold
Copy link
Contributor

The absence of these kernels is intentional - #2647

@wForget
Copy link
Member Author

wForget commented Feb 20, 2025

The absence of these kernels is intentional - #2647

Thank you for the information, if I understand correctly, it is to avoid inconsistent behavior caused by unchecked divide by zero returning NaN. This PR uses mod_wrapping method directly which keeps panics on divide by zero, only wrapping on overflow for signed_integer::MIN / -1.

@tustvold
Copy link
Contributor

The motivation for the wrapping variants is solely performance, as they vectorise better and are therefore orders of magnitude faster. This does not apply to the division operators, I'm therefore unclear on why one would want an unchecked version, when it will be no faster

@wForget
Copy link
Member Author

wForget commented Feb 20, 2025

The motivation for the wrapping variants is solely performance, as they vectorise better and are therefore orders of magnitude faster. This does not apply to the division operators, I'm therefore unclear on why one would want an unchecked version, when it will be no faster

Thank you for your explanation. My motivation was just to provide div/rem implementation similar to add_wrapping/sub_wrapping that does not fail when overflow occurs. That way we can easily provide consistent behavior for binary operators in Datafusion. BinaryExpr in Datafusion defines a fail_on_overflow property which when false means it will not fail on overflow, and there is a similar behavior in Spark.

@tustvold
Copy link
Contributor

If I'm not mistaken, these new kernels will panic on divide by zero?

@wForget
Copy link
Member Author

wForget commented Feb 21, 2025

If I'm not mistaken, these new kernels will panic on divide by zero?

Yes, should divide by zero be considered as overflow?

image

@tustvold
Copy link
Contributor

tustvold commented Feb 21, 2025

Yes, should divide by zero be considered as overflow?

At the very least it shouldn't panic 😅

I personally feel it would be better to simply document that fail_on_overflow doesn't impact division operators. It seems a pretty tough pill to effectively double the codegen in order to handle a single set of arguments that overflows... Especially when in general overflow leads to an error (e.g. in the cast kernels).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support div_wrapping/rem_wrapping for numeric arithmetic kernels
2 participants