-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GCC14] Avoid std::clamp in device code #47266
Conversation
please test |
cms-bot internal usage |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47266/43568 |
A new Pull Request was created by @iarspider for master. It involves the following packages:
@jfernan2, @mandrenguyen can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
please test for el8_amd64_gcc14 |
// which doesn't compile with gcc14 due to reference to __glibcxx_assert | ||
// See https://github.com/llvm/llvm-project/issues/95183 | ||
int tmp_max = std::max<int>(iz, INT8_MIN); | ||
iz = std::min<int>(iz, INT8_MAX); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tmp_max is unused....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should use NVIDIA implementation for cuda backends
https://nvidia.github.io/cccl/libcudacxx/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but we will also need to implement workaround for ROCm (I think it has the same issue)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you check with hipcc? My understanding was that llvm took care of those
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you check with hipcc? My understanding was that llvm took care of those
>> Compiling edm plugin src/RecoLocalTracker/SubCollectionProducers/src/SeedClusterRemover.cc
In file included from <built-in>:1:
In file included from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02874/el8_amd64_gcc14/external/rocm/6.2.4-4e7b4b5933de5a6448978654268b9b7f/lib/llvm/lib/clang/18/include/__clang_hip_runtime_wrapper.h:145:
In file included from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02874/el8_amd64_gcc14/external/rocm/6.2.4-4e7b4b5933de5a6448978654268b9b7f/lib/llvm/lib/clang/18/include/cuda_wrappers/algorithm:55:
In file included from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02874/el8_amd64_gcc14/external/gcc/14.2.1-60e8d7f953e2e96ab45420ba74bbf3ae/lib/gcc/x86_64-redhat-linux-gnu/14.2.1/../../../../include/c++/14.2.1/algorithm:61:
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02874/el8_amd64_gcc14/external/gcc/14.2.1-60e8d7f953e2e96ab45420ba74bbf3ae/lib/gcc/x86_64-redhat-linux-gnu/14.2.1/../../../../include/c++/14.2.1/bits/stl_algo.h:3626:7: error: reference to __host__ function '__glibcxx_assert_fail' in __host__ __device__ function
3626 | __glibcxx_assert(!(__hi < __lo));
| ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02874/el8_amd64_gcc14/external/gcc/14.2.1-60e8d7f953e2e96ab45420ba74bbf3ae/lib/gcc/x86_64-redhat-linux-gnu/14.2.1/../../../../include/c++/14.2.1/x86_64-redhat-linux-gnu/bits/c++config.h:614:12: note: expanded from macro '__glibcxx_assert'
614 | std::__glibcxx_assert_fail(); \
| ^
src/RecoLocalTracker/SiPixelRecHits/interface/pixelCPEforDevice.h:382:19: note: called by 'errorFromDB<pixelTopology::Phase1>'
382 | int jx = std::clamp(bin_value, low_value, high_value);
| ^
src/RecoLocalTracker/SiPixelRecHits/plugins/alpaka/PixelRecHits.h:183:34: note: called by 'operator()'
183 | pixelCPEforDevice::errorFromDB<TrackerTraits>(
| ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02874/el8_amd64_gcc14/external/alpaka/1.2.0-bfbd7b77995d3af9fba1eba156781229/include/alpaka/kernel/TaskKernelGpuUniformCudaHipRt.hpp:75:13: note: called by 'gpuKernel<alpaka_rocm_async::pixelRecHits::GetHits<pixelTopology::Phase1>, alpaka::ApiHipRt, alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1>, unsigned int>, std::integral_constant<unsigned long, 1>, unsigned int, const pixelCPEforDevice::ParamsOnDeviceT<pixelTopology::Phase1> *, const BeamSpotPOD *, SiPixelDigisLayout<>::ConstViewTemplateFreeParams<128, false, true, true>, unsigned int, unsigned int, SiPixelClustersLayout<>::ConstViewTemplateFreeParams<128, false, true, true>, TrackingRecHitSoA<pixelTopology::Phase1>::Layout<>::ViewTemplateFreeParams<128, false, true, true>>'
75 | kernelFnObj(const_cast<TAcc const&>(acc), args...);
| ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02874/el8_amd64_gcc14/external/gcc/14.2.1-60e8d7f953e2e96ab45420ba74bbf3ae/lib/gcc/x86_64-redhat-linux-gnu/14.2.1/../../../../include/c++/14.2.1/x86_64-redhat-linux-gnu/bits/c++config.h:608:3: note: '__glibcxx_assert_fail' declared here
608 | __glibcxx_assert_fail()
| ^
In file included from <built-in>:1:
In file included from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02874/el8_amd64_gcc14/external/rocm/6.2.4-4e7b4b5933de5a6448978654268b9b7f/lib/llvm/lib/clang/18/include/__clang_hip_runtime_wrapper.h:145:
In file included from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02874/el8_amd64_gcc14/external/rocm/6.2.4-4e7b4b5933de5a6448978654268b9b7f/lib/llvm/lib/clang/18/include/cuda_wrappers/algorithm:55:
In file included from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02874/el8_amd64_gcc14/external/gcc/14.2.1-60e8d7f953e2e96ab45420ba74bbf3ae/lib/gcc/x86_64-redhat-linux-gnu/14.2.1/../../../../include/c++/14.2.1/algorithm:61:
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02874/el8_amd64_gcc14/external/gcc/14.2.1-60e8d7f953e2e96ab45420ba74bbf3ae/lib/gcc/x86_64-redhat-linux-gnu/14.2.1/../../../../include/c++/14.2.1/bits/stl_algo.h:3626:7: error: reference to __host__ function '__glibcxx_assert_fail' in __host__ __device__ function
3626 | __glibcxx_assert(!(__hi < __lo));
| ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02874/el8_amd64_gcc14/external/gcc/14.2.1-60e8d7f953e2e96ab45420ba74bbf3ae/lib/gcc/x86_64-redhat-linux-gnu/14.2.1/../../../../include/c++/14.2.1/x86_64-redhat-linux-gnu/bits/c++config.h:614:12: note: expanded from macro '__glibcxx_assert'
614 | std::__glibcxx_assert_fail(); \
| ^
src/RecoLocalTracker/SiPixelRecHits/interface/pixelCPEforDevice.h:382:19: note: called by 'errorFromDB<pixelTopology::HIonPhase1>'
382 | int jx = std::clamp(bin_value, low_value, high_value);
| ^
src/RecoLocalTracker/SiPixelRecHits/plugins/alpaka/PixelRecHits.h:183:34: note: called by 'operator()'
183 | pixelCPEforDevice::errorFromDB<TrackerTraits>(
| ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02874/el8_amd64_gcc14/external/alpaka/1.2.0-bfbd7b77995d3af9fba1eba156781229/include/alpaka/kernel/TaskKernelGpuUniformCudaHipRt.hpp:75:13: note: called by 'gpuKernel<alpaka_rocm_async::pixelRecHits::GetHits<pixelTopology::HIonPhase1>, alpaka::ApiHipRt, alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1>, unsigned int>, std::integral_constant<unsigned long, 1>, unsigned int, const pixelCPEforDevice::ParamsOnDeviceT<pixelTopology::HIonPhase1> *, const BeamSpotPOD *, SiPixelDigisLayout<>::ConstViewTemplateFreeParams<128, false, true, true>, unsigned int, unsigned int, SiPixelClustersLayout<>::ConstViewTemplateFreeParams<128, false, true, true>, TrackingRecHitSoA<pixelTopology::HIonPhase1>::Layout<>::ViewTemplateFreeParams<128, false, true, true>>'
75 | kernelFnObj(const_cast<TAcc const&>(acc), args...);
| ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02874/el8_amd64_gcc14/external/gcc/14.2.1-60e8d7f953e2e96ab45420ba74bbf3ae/lib/gcc/x86_64-redhat-linux-gnu/14.2.1/../../../../include/c++/14.2.1/x86_64-redhat-linux-gnu/bits/c++config.h:608:3: note: '__glibcxx_assert_fail' declared here
608 | __glibcxx_assert_fail()
| ^
2 errors generated when compiling for gfx1030.
So it seems to fail for ROCm, but not for CUDA (at least I couldn't find any failures for CUDA in the output)
please abort |
4d90f6a
to
04aa605
Compare
please test |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47266/43570 |
please test |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47266/43572 |
Pull request #47266 was updated. @jfernan2, @mandrenguyen can you please check and sign again. |
is cuda that is messing up with the gcc config... |
... as does the manual implementation: https://godbolt.org/z/j1dGET7Pc |
+1 Size: This PR adds an extra 28KB to repository The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic: You can see more details here: Comparison SummarySummary:
|
+1 |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @antoniovilela, @mandrenguyen, @rappoccio (and backports should be raised in the release meeting by the corresponding L2) |
+1 |
PR description:
When testing CMSSW with gcc14, the following error is emitted:
This issue was reported to LLVM and GCC. Fix inspired by pytorch's fix of the same problem.
PR validation:
Bot tests