Expose stream-ordering in scalar and avro APIs #17766

Open: wants to merge 5 commits into base branch-25.02

Contributor

@shrshi shrshi commented Jan 17, 2025

Description

Contributes to #13744

Replaces the conversion operators in classes derived from cudf::scalar with a stream-ordered get_value(stream) member function.
Adds a stream parameter to cudf::io::read_avro.
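
The motivation can be illustrated with a minimal, self-contained sketch. It does not use libcudf: `mock_stream`, `mock_scalar`, `default_stream()`, and `last_stream_id()` are hypothetical stand-ins invented for this example (loosely mirroring `rmm::cuda_stream_view`, a class derived from `cudf::scalar`, and `cudf::get_default_stream()`). The point it shows is a plain C++ one: a conversion operator cannot declare parameters, so it cannot accept a stream, while a named member function can take one with a default argument.

```cpp
#include <cassert>

// Hypothetical stand-in for rmm::cuda_stream_view; it only carries an id.
struct mock_stream {
  int id;
};

// Hypothetical stand-in for cudf::get_default_stream().
inline mock_stream default_stream() { return mock_stream{0}; }

// Hypothetical stand-in for a fixed-width scalar. A real scalar keeps its
// value in device memory; here it is a plain host member.
template <typename T>
class mock_scalar {
 public:
  explicit mock_scalar(T v) : value_{v} {}

  // Old style: a conversion operator cannot declare parameters, so the
  // read it performs can only ever use the default stream.
  explicit operator T() const
  {
    last_stream_id_ = default_stream().id;
    return value_;
  }

  // New style: the caller chooses the stream the read is ordered on.
  T get_value(mock_stream stream = default_stream()) const
  {
    last_stream_id_ = stream.id;
    return value_;
  }

  // Records which stream the most recent read used (illustration only).
  int last_stream_id() const { return last_stream_id_; }

 private:
  T value_;
  mutable int last_stream_id_{-1};
};

// Usage sketch: both reads return the same value; only the second one
// lets the caller name the stream.
inline void stream_ordering_example()
{
  mock_scalar<int> s{42};
  mock_stream side_stream{7};

  assert(static_cast<int>(s) == 42);  // always the default stream
  assert(s.last_stream_id() == 0);

  assert(s.get_value(side_stream) == 42);  // explicitly stream-ordered
  assert(s.last_stream_id() == 7);
}
```

Keeping the default argument means existing call sites that do not care about streams continue to compile unchanged.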

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.


copy-pr-bot bot commented Jan 17, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Jan 17, 2025
@shrshi shrshi added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jan 17, 2025
@github-actions github-actions bot added the CMake CMake build issue label Jan 18, 2025
@shrshi shrshi marked this pull request as ready for review January 18, 2025 00:22
@shrshi shrshi requested review from a team as code owners January 18, 2025 00:22
@shrshi shrshi requested review from vyasr and davidwendt January 18, 2025 00:22
   */
-  explicit operator value_type() const;
+  T get_value(rmm::cuda_stream_view stream = cudf::get_default_stream()) const;
Contributor

RMM just calls this value. We often try to avoid “get” in our function names.

Suggested change
- T get_value(rmm::cuda_stream_view stream = cudf::get_default_stream()) const;
+ T value(rmm::cuda_stream_view stream = cudf::get_default_stream()) const;

Contributor

Wait, doesn’t value already exist just below here? We may not need get_value at all. Maybe we just need to delete the conversion operator since it does not take a stream.

Contributor Author

Good point - I chose to introduce another function here because value can return either host or device data depending on the type. string_scalar::value returns a cudf::string_view on the device, while fixed_point::value and fixed_width::value copy to host. I was also unsure of using fixed_point::value directly since the conversion operator returns the fixed_point_value instead of the unscaled value.
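
The difference between the unscaled value and the fixed-point value can be sketched without libcudf. Everything below is a hypothetical stand-in written for this illustration (`mock_fixed_point`, `mock_fixed_point_scalar`), and storing the scale as a base-10 exponent is an assumption of the sketch, not a statement about the library's internals.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a decimal value, interpreted as rep * 10^scale.
struct mock_fixed_point {
  std::int64_t rep;  // unscaled integer representation
  int scale;         // base-10 exponent (assumed convention)
};

// Hypothetical stand-in for a fixed-point scalar. Stream parameters are
// omitted because nothing in this sketch lives on a device.
class mock_fixed_point_scalar {
 public:
  mock_fixed_point_scalar(std::int64_t rep, int scale) : rep_{rep}, scale_{scale} {}

  // Returns only the unscaled integer.
  std::int64_t value() const { return rep_; }

  // Returns the integer together with its scale.
  mock_fixed_point fixed_point_value() const { return mock_fixed_point{rep_, scale_}; }

 private:
  std::int64_t rep_;
  int scale_;
};

// Usage sketch: the two accessors return different things for the same scalar.
inline void fixed_point_example()
{
  mock_fixed_point_scalar s{12345, -2};  // represents 123.45
  assert(s.value() == 12345);

  auto const fp = s.fixed_point_value();
  assert(fp.rep == 12345 && fp.scale == -2);
}
```

If a conversion operator yields the second form, replacing it with a call to `value()` would change what callers receive, which is the concern raised in this comment.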

Contributor Author

I'm hesitant to modify string_scalar::value to return a host string view since it is used in several places that directly access the scalar on device. One option is to rename this function to string_scalar::d_value and introduce another string_scalar::value that returns a std::string?

Contributor

The string_scalar::value() should return a cudf::string_view which is possible from host or device since it just wraps a pointer and a size held by the string_scalar. No device code is needed. Actually the stream parameter is not used at all.
To return a host std::string one would use the string_scalar::to_string() function.
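
A minimal sketch of that distinction, assuming nothing about libcudf's actual implementation: `mock_string_view` and `mock_string_scalar` are hypothetical stand-ins, and the buffer here is ordinary host memory rather than a device allocation. It shows that building a pointer-and-size view copies no bytes, whereas producing an owning host string does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for cudf::string_view: a non-owning pointer + size.
struct mock_string_view {
  char const* data;
  std::size_t size;
};

// Hypothetical stand-in for string_scalar. The buffer is host memory here;
// in a device-backed scalar it would be a device allocation.
class mock_string_scalar {
 public:
  explicit mock_string_scalar(std::string s) : buffer_{std::move(s)} {}

  // Building the view only reads the stored pointer and size. No bytes are
  // copied, so no stream is involved.
  mock_string_view value() const { return mock_string_view{buffer_.data(), buffer_.size()}; }

  // Produces an owning copy of the characters. In a device-backed scalar
  // this is the step that would need a stream-ordered copy.
  std::string to_string() const { return std::string(buffer_.data(), buffer_.size()); }

 private:
  std::string buffer_;
};

// Usage sketch: the view aliases the scalar's storage, the string does not.
inline void string_view_example()
{
  mock_string_scalar s{"hello"};

  mock_string_view const v = s.value();
  assert(v.size == 5);

  std::string const host = s.to_string();
  assert(host == "hello");
  assert(host.data() != v.data);  // the copy owns separate storage
}
```

With a device-backed buffer, the view's pointer would not be dereferenceable on the host, so host code that needs the characters would go through the copying accessor.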

Contributor

I agree with Bradley: we should not have the get_value() functions and should just keep the value() ones.

   */
-  explicit operator value_type() const;
+  T get_value(rmm::cuda_stream_view stream = cudf::get_default_stream()) const;
Contributor

Maybe we should delete this. There are already value and fixed_point_value functions.

   */
-  explicit operator std::string() const;
+  std::string get_value(rmm::cuda_stream_view stream = cudf::get_default_stream()) const;
Contributor

Same, remove this.

@@ -81,7 +81,7 @@ void ASSERT_BINOP(cudf::column_view const& out,
                   TypeOp&& op,
                   ValueComparator const& value_comparator = ValueComparator())
 {
-  auto lhs_h = static_cast<ScalarType const&>(lhs).operator TypeLhs();
+  auto lhs_h = static_cast<ScalarType const&>(lhs).get_value(cudf::get_default_stream());
Contributor

Please refactor the tests like this:

Suggested change
- auto lhs_h = static_cast<ScalarType const&>(lhs).get_value(cudf::get_default_stream());
+ auto lhs_h = static_cast<ScalarType const&>(lhs).value(cudf::get_default_stream());
