Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clang UB sanitizer CI test: increase coverage #5597

Open
wants to merge 9 commits into
base: development
Choose a base branch
from

Conversation

lucafedeli88
Copy link
Member

@lucafedeli88 lucafedeli88 commented Jan 23, 2025

In CI test based on clang UB sanitizers, most of the time (~ 1h30) is spent in compiling the code, while just a few minutes are spent actually running some simulations. This means that we can increase the coverage of the test by adding some more simulations to the tests with a negligible increase of the total runtime.
This PR does just that: now most of the cases in Examples/Physics_applications are tested with the UB sanitizer.

Note that some cases cannot run in double precision (see below). For this reason, the PR also splits the UB sanitizer test into single precision and double precision (in double precision only the cases that cannot run in single precision are tested).

Updates:

Issue found while running inputs_test_3d_beam_beam_collision --> We need to run this case in double precision

The tool has found an issue while running mpirun -n 2 ./build/bin/warpx.3d Examples/Physics_applications/beam_beam_collision/inputs_test_3d_beam_beam_collision :

STEP 1 starts ...
/home/runner/work/WarpX/WarpX/build/_deps/fetchedpicsar-src/multi_physics/QED/include/picsar_qed/containers/picsar_tables.hpp:310:17: runtime error: -nan is outside the range of representable values of type 'int'
/home/runner/work/WarpX/WarpX/build/_deps/fetchedpicsar-src/multi_physics/QED/include/picsar_qed/containers/picsar_tables.hpp:310:17: runtime error: -nan is outside the range of representable values of type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/runner/work/WarpX/WarpX/build/_deps/fetchedpicsar-src/multi_physics/QED/include/picsar_qed/containers/picsar_tables.hpp:310:17 in 
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/runner/work/WarpX/WarpX/build/_deps/fetchedpicsar-src/multi_physics/QED/include/picsar_qed/containers/picsar_tables.hpp:310:17 in 

I've temporarily commented out this case while I investigate the cause. For the moment, I am not able to reproduce the issue on my local machine. The issue is using single precision for this specific test case! Specifically, momenta end up being NaN and the sanitizer detects the attempt to convert a floating point NaN into an integer.

Issue found while running inputs_test_2d_background_mcc --> We need to run this case in double precision

MLMG does not converge in single precision for this simulation case. We need to run it in double precision.

Issue found while running free_electron_laser --> We need to run this case in double precision

I have observed this issue:

 STEP 444 starts ...
0::1::Assertion `m_current_z_lab[i_buffer] >= m_buffer_domain_lab[i_buffer].lo(m_moving_window_dir) and m_current_z_lab[i_buffer] <= m_buffer_domain_lab[i_buffer].hi(m_moving_window_dir)' failed, file "/home/runner/work/WarpX/WarpX/Source/Diagnostics/BTDiagnostics.cpp", line 870, Msg: 
 ### ERROR   : z-slice in lab-frame (0.299976) is outside the buffer domain
#            physical extent (0.299976 to 0.299988).
 !!!

which seems to be related to using single precision instead of double precision. Therefore, we need to run this case in double precision.

Issue found while running inputs_test_2d_laser_ion_acc --> bugfix in WarpX

inputs_test_2d_laser_ion_acc case has the following issue in single precision:

--- INFO    : Writing openPMD file diags/openPMDbw000000
terminate called after throwing an instance of 'std::runtime_error'
  what():  Datatypes of chunk data (FLOAT) and record component (DOUBLE) do not match.
SIGABRT
See Backtrace.0.0 file for details

This comes from the fact that the datatype of this dataset in ParticleHistogram2D.cpp is hard-coded as double:

    auto dataset = io::Dataset(
            io::determineDatatype<double>(),
            {static_cast<unsigned long>(m_bin_num_ord), static_cast<unsigned long>(m_bin_num_abs)});

this PR modifies these lines as follows:

    auto dataset = io::Dataset(
            io::determineDatatype<amrex::Real>(),
            {static_cast<unsigned long>(m_bin_num_ord), static_cast<unsigned long>(m_bin_num_abs)});

@lucafedeli88 lucafedeli88 added the component: tests Tests and CI label Jan 23, 2025
@lucafedeli88 lucafedeli88 changed the title Clang UB sanitizer CI test: increase coverage [WIP] Clang UB sanitizer CI test: increase coverage Jan 23, 2025
@lucafedeli88 lucafedeli88 changed the title [WIP] Clang UB sanitizer CI test: increase coverage Clang UB sanitizer CI test: increase coverage Jan 24, 2025
@lucafedeli88 lucafedeli88 requested review from EZoni and ax3l January 24, 2025 09:32
@@ -298,7 +298,7 @@ void ParticleHistogram2D::WriteToFile (int step) const
data.setPosition<amrex::Real>({0.5, 0.5});

auto dataset = io::Dataset(
io::determineDatatype<double>(),
io::determineDatatype<amrex::Real>(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't the properties be in ParticleReal?

Suggested change
io::determineDatatype<amrex::Real>(),
io::determineDatatype<amrex::ParticleReal>(),

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: tests Tests and CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants