Expose stream-ordering in scalar and avro APIs #17766

shrshi · 2025-01-17T22:49:58Z

Description

Contributes to #13744

Replaces the conversion operators in classes derived from cudf::scalar with a stream-ordered get_value(stream) member function.
Adds a stream parameter to cudf::io::read_avro.
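
For readers unfamiliar with the motivation, here is a minimal sketch of the shape of this change. It uses hypothetical stand-in types (stream_view, mock_scalar, read_on_stream) rather than the real rmm and cudf classes, and the "device" value is ordinary host memory so the sketch runs without a GPU. The point it illustrates: a conversion operator has no parameter list and so cannot accept a stream, while a named member function can.

```cpp
#include <vector>

// Hypothetical stand-in for rmm::cuda_stream_view; it only carries an id.
struct stream_view {
  int id = 0;
};

// Hypothetical stand-in for a fixed-width scalar (not the cudf class).
template <typename T>
class mock_scalar {
 public:
  explicit mock_scalar(T v) : device_value_{v} {}

  // Before: `explicit operator T() const;` takes no arguments, so the caller
  // cannot say which stream the copy should be ordered on.
  // After: a named accessor with an explicit, defaulted stream parameter.
  T get_value(stream_view stream = stream_view{}) const
  {
    // A real implementation would issue a device-to-host copy on `stream`;
    // this sketch only records which stream was requested.
    streams_used_.push_back(stream.id);
    return device_value_;
  }

  std::vector<int> const& streams_used() const { return streams_used_; }

 private:
  T device_value_;
  mutable std::vector<int> streams_used_;
};

// Hypothetical caller that reads the scalar on a specific stream.
inline int read_on_stream(mock_scalar<int> const& s, int stream_id)
{
  return s.get_value(stream_view{stream_id});
}
```

Defaulting the stream parameter keeps existing call sites compiling while letting stream-aware callers pass their own stream.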

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

copy-pr-bot · 2025-01-17T22:50:01Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

bdice · 2025-01-18T00:51:56Z

cpp/include/cudf/scalar/scalar.hpp

   */
-  explicit operator value_type() const;
+  T get_value(rmm::cuda_stream_view stream = cudf::get_default_stream()) const;


RMM just calls this value. We often try to avoid “get” in our function names.

Suggested change

-  T get_value(rmm::cuda_stream_view stream = cudf::get_default_stream()) const;
+  T value(rmm::cuda_stream_view stream = cudf::get_default_stream()) const;

bdice · 2025-01-18

Wait, doesn’t value already exist just below here? We may not need get_value at all. Maybe we just need to delete the conversion operator, since it does not take a stream.

shrshi · 2025-01-23

Good point. I chose to introduce another function here because value can return either host or device data depending on the type: string_scalar::value returns a cudf::string_view on the device, while fixed_point::value and fixed_width::value copy to host. I was also unsure about using fixed_point::value directly, since the conversion operator returns the fixed_point_value instead of the unscaled value.

shrshi · 2025-01-23

I'm hesitant to modify string_scalar::value to return a host string view, since it is used in several places that directly access the scalar on device. One option is to rename this function to string_scalar::d_value and introduce another string_scalar::value that returns a std::string?

davidwendt · 2025-01-23

string_scalar::value() should return a cudf::string_view, which is possible from host or device since it just wraps a pointer and a size held by the string_scalar. No device code is needed. Actually, the stream parameter is not used at all.
To return a host std::string, one would use the string_scalar::to_string() function.

davidwendt · 2025-01-23

I agree with Bradley: we should not have the get_value() functions and should just keep the value() ones.
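
The distinction drawn in this thread can be sketched as follows. This is only an illustration with hypothetical stand-in types (mock_string_view, mock_string_scalar), not the cudf classes; the buffer here is host memory so the sketch runs anywhere, whereas the real scalar would hold device memory. It shows that building a non-owning view needs only the stored pointer and size, while producing an owning std::string has to read the characters.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for cudf::string_view: a non-owning pointer and size.
struct mock_string_view {
  char const* data;
  std::size_t size_bytes;
};

// Hypothetical stand-in for string_scalar (not the cudf class).
class mock_string_scalar {
 public:
  explicit mock_string_scalar(std::string s) : buffer_{std::move(s)} {}

  // Builds the view from the stored pointer and size only. The characters
  // are never dereferenced, so no copy (and no stream) is involved.
  mock_string_view value() const { return {buffer_.data(), buffer_.size()}; }

  // Copies the characters into an owning std::string. With a real device
  // buffer this is the step that would need a device-to-host copy.
  std::string to_string() const { return std::string(buffer_.data(), buffer_.size()); }

 private:
  std::string buffer_;
};
```

Note that a view built this way over a device buffer is only safe to dereference in device code; on the host it is just a handle.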

bdice · 2025-01-18T00:55:51Z

cpp/include/cudf/scalar/scalar.hpp

   */
-  explicit operator value_type() const;
+  T get_value(rmm::cuda_stream_view stream = cudf::get_default_stream()) const;


Maybe we should delete this. There are already value and fixed_point_value functions.
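
To make the difference between the two existing accessors concrete, here is a small sketch using a hypothetical stand-in type (mock_fixed_point_scalar). It assumes the usual decimal convention that the represented number is rep * 10^scale, and it returns a double purely for illustration; the real fixed_point_value returns a decimal type, not a double.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for a decimal scalar (not the cudf class).
struct mock_fixed_point_scalar {
  std::int32_t rep;    // unscaled stored integer
  std::int32_t scale;  // base-10 exponent: represented number is rep * 10^scale

  // Analogous to value(): returns the unscaled representation.
  std::int32_t value() const { return rep; }

  // Analogous to fixed_point_value(): returns the scaled number.
  // A double is used here only to keep the sketch short.
  double fixed_point_value() const { return rep * std::pow(10.0, scale); }
};
```

For example, with rep = 12345 and scale = -2, value() yields 12345 while fixed_point_value() yields approximately 123.45, which is why the two cannot be swapped blindly when replacing the conversion operator.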

bdice · 2025-01-18T00:56:16Z

cpp/include/cudf/scalar/scalar.hpp

   */
-  explicit operator std::string() const;
+  std::string get_value(rmm::cuda_stream_view stream = cudf::get_default_stream()) const;


Same, remove this.

bdice · 2025-01-18T00:58:17Z

cpp/tests/binaryop/assert-binops.h

@@ -81,7 +81,7 @@ void ASSERT_BINOP(cudf::column_view const& out,
                  TypeOp&& op,
                  ValueComparator const& value_comparator = ValueComparator())
 {
-  auto lhs_h    = static_cast<ScalarType const&>(lhs).operator TypeLhs();
+  auto lhs_h    = static_cast<ScalarType const&>(lhs).get_value(cudf::get_default_stream());


Please refactor the tests like this:

Suggested change

-  auto lhs_h = static_cast<ScalarType const&>(lhs).get_value(cudf::get_default_stream());
+  auto lhs_h = static_cast<ScalarType const&>(lhs).value(cudf::get_default_stream());

shrshi added 2 commits January 17, 2025 22:35

streams to scalar classes

3733668

streams to avro

40f6ba7

github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Jan 17, 2025

github-actions bot assigned shrshi Jan 17, 2025

shrshi added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jan 17, 2025

added test

5d06857

github-actions bot added the CMake CMake build issue label Jan 18, 2025

shrshi marked this pull request as ready for review January 18, 2025 00:22

shrshi requested review from a team as code owners January 18, 2025 00:22

shrshi requested review from vyasr and davidwendt January 18, 2025 00:22

shrshi added 2 commits January 18, 2025 00:25

fix

ffab239

Merge branch 'branch-25.02' into streams-final

a1c5edc

bdice reviewed Jan 18, 2025


This was referenced Jan 18, 2025

Expose streams in the Avro Reader #17631 (Closed)

Add stream parameters in pylibcudf IO APIs #17620 (Draft)
