Feature/mrhs misc #1515

Merged
maddyscientist merged 32 commits into develop from feature/mrhs-misc on Dec 5, 2024

maddyscientist (Member) commented Nov 8, 2024

This PR is a bit of a catch-all:

  • Adds register tiling for the MRHS staggered dslash
    • The level of register tiling is controlled by a CMake parameter, QUDA_MAX_MULTI_RHS_TILE, with the default left at size 1 for now.
    • This feature will be further developed in subsequent PRs
    • (Although not included in this PR, it's straightforward to extend this support to other stencils)
  • Update to the latest version of Eigen on the 3.4 branch
    • This has improved support for nvc++ allowing us to remove some prior WARs
  • Fixes performance regressions of the MMA dslash when the memory pool is switched off
    • The FieldTmp now supports creating temporaries using parameters as opposed to another field instance
    • We use this to create the temporary used for the reordered quark fields
  • Adds WAR for performance regressions with ROCm
    • This improves performance on ROCm 5.3 by 30% for the Laplace 3-d operator, though it's still off by integer factors
  • Various fixes for nvc++ compilation
  • Adds an alternative sentinel for heterogeneous reductions for the case where the compiler optimizes away non-finite math (enabled with QUDA_HETEROGENEOUS_ATOMIC_INF_INIT=OFF). This is not a problem by default, but it is with the latest clang when compiling with -Ofast.
  • Fixes various compiler warnings with more recent compilers, e.g., gcc-15
  • Fixes a hang caused by process divergence when calling printGenericMatrix
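
To illustrate the register-tiling idea in the first bullet, here is a minimal host-side C++ sketch. It is not QUDA code: the 1-d nearest-neighbour stencil, the function name `stencil_tiled`, and the `Tile` template parameter (a stand-in for what QUDA_MAX_MULTI_RHS_TILE would control) are all assumptions made for illustration. The point is only that the per-site coefficients are loaded once and reused for every right-hand side in the tile.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1-d stencil applied to n_rhs vectors stored as in[r * n + i]:
//   out[r][i] = u[i] * in[r][i+1] - u[i-1] * in[r][i-1]   (periodic boundaries)
// Tile plays the role of a compile-time register tile size.
template <std::size_t Tile>
void stencil_tiled(std::vector<double> &out, const std::vector<double> &in,
                   const std::vector<double> &u, std::size_t n_rhs, std::size_t n)
{
  static_assert(Tile >= 1, "tile size must be at least 1");
  assert(in.size() == n_rhs * n && out.size() == n_rhs * n && u.size() == n);

  for (std::size_t r0 = 0; r0 < n_rhs; r0 += Tile) {
    // The last tile may be partial if Tile does not divide n_rhs
    const std::size_t tile = std::min(Tile, n_rhs - r0);

    for (std::size_t i = 0; i < n; i++) {
      const std::size_t fwd = (i + 1) % n;
      const std::size_t bwd = (i + n - 1) % n;

      // Coefficients are loaded once per site and reused across the tile
      const double u_fwd = u[i];
      const double u_bwd = u[bwd];

      // Small fixed-size accumulator that a compiler can keep in registers
      std::array<double, Tile> acc{};
      for (std::size_t t = 0; t < tile; t++) {
        const std::size_t r = r0 + t;
        acc[t] = u_fwd * in[r * n + fwd] - u_bwd * in[r * n + bwd];
      }
      for (std::size_t t = 0; t < tile; t++) out[(r0 + t) * n + i] = acc[t];
    }
  }
}
```

With `Tile = 1` this reduces to processing one right-hand side at a time, which matches the default described above; larger tiles trade register pressure for fewer coefficient loads.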
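
For context on the sentinel bullet: -Ofast enables finite-math-only assumptions, so a check such as "is this slot still infinity?" can be folded away by the compiler. The sketch below shows one way to sidestep that, by using a finite sentinel and comparing bit patterns as integers. It is an assumption-laden illustration, not QUDA's implementation: the `ReductionSlot` type, its member names, and the choice of `lowest()` as the sentinel are all hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Reinterpret a double as its 64-bit pattern (well-defined via memcpy)
inline std::uint64_t to_bits(double x)
{
  std::uint64_t b;
  std::memcpy(&b, &x, sizeof(b));
  return b;
}

inline double from_bits(std::uint64_t b)
{
  double x;
  std::memcpy(&x, &b, sizeof(x));
  return x;
}

// Hypothetical finite sentinel; assumes a real result never equals it
constexpr double sentinel = std::numeric_limits<double>::lowest();

// Hypothetical slot that a producer fills and a consumer polls
struct ReductionSlot {
  std::atomic<std::uint64_t> bits{to_bits(sentinel)};

  void publish(double v)
  {
    assert(to_bits(v) != to_bits(sentinel)); // result must not collide with the sentinel
    bits.store(to_bits(v), std::memory_order_release);
  }

  // Integer comparison: unaffected by finite-math-only assumptions
  bool ready() const { return bits.load(std::memory_order_acquire) != to_bits(sentinel); }

  double value() const { return from_bits(bits.load(std::memory_order_acquire)); }

  void reset() { bits.store(to_bits(sentinel), std::memory_order_release); }
};
```

A consumer would spin on `ready()` and then read `value()`. Because the comparison is done on integers, the compiler has no floating-point identity to exploit when non-finite math is disabled.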

maddyscientist requested review from a team as code owners on November 20, 2024 19:01
Review threads:
  • include/kernels/laplace.cuh (resolved)
  • lib/inv_mr_quda.cpp (resolved)
  • include/kernels/dslash_staggered.cuh (resolved)
  • CMakeLists.txt (outdated, resolved)
weinbe2 (Contributor) left a comment

Pending a few cosmetic requests, this looks good. Thanks @maddyscientist!

maddyscientist merged commit a54595d into develop on Dec 5, 2024
7 checks passed
maddyscientist deleted the feature/mrhs-misc branch on December 5, 2024 05:49
3 participants