Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][libclc] Add generic addrspace overloads of math builtins #13015

Merged
merged 8 commits into from
Mar 21, 2024

Conversation

frasercrmck
Copy link
Contributor

@frasercrmck frasercrmck commented Mar 13, 2024

The generic implementations of the math builtins which take pointer arguments were using unqualified address spaces. This could either resolve to the generic address space or the private address space, depending on whether the target supports the generic address space or not.

The newer unified OpenCL C specification is clearer in mandating that all targets must provide overloads on the explicitly qualified 'private' address space, as well as optionally defining ones on the (unqualified) generic address space. This meant that most of these math builtins were lacking one overload: either the private or generic one, depending on which target was compiling the builtins.

One notable exception here is NVIDIA, which maps the private and generic
address spaces to the same target address space. Thus declaring builtins
overloaded on these two address spaces results in a mangling clash,
which we can't have. Therefore we now say that NVIDIA targets don't
support the generic address space for the purposes of these builtins. In
reality, the builtins with the private address space are functionally
equivalent to the generic ones, so users won't notice.

For the sake of code clarity, although the 'generic' keyword is technically reserved, we know that clang defines it to be the corresponding unqualified generic address space, so we use that to be explicit. We always compile with clang so it shouldn't be a problem with portability.

With this we can also enable a LIT test for HIP, which was previously failing as it couldn't find the generic address space overloads of the fract and lgamma_r builtins.

There are other builtins that this treatment (may) need applied to, such as the vload and vstore variants. Those will be handled in a subsequent patch.

@frasercrmck frasercrmck requested review from a team as code owners March 13, 2024 18:52
@frasercrmck frasercrmck force-pushed the libclc-generic branch 2 times, most recently from 6f32ac4 to 739cc92 Compare March 19, 2024 11:00
The generic implementations of the math builtins which take pointer
arguments were using unqualified address spaces. This could either
resolve to the generic address space or the private address space,
depending on whether the target supports the generic address space or
not.

The newer unified OpenCL C specification is clearer in mandating that
all targets must provide overloads on the explicitly qualified
'private' address space, as well as optionally defining ones on the
(unqualified) generic address space. This meant that most of these math
builtins were lacking one overload: either the private or generic one,
depending on which target was compiling the builtins.

One notable exception here is NVIDIA, which maps the private and generic
address spaces to the same target address space. Thus declaring builtins
overloaded on these two address spaces results in a mangling clash,
which we can't have. Therefore we now say that NVIDIA targets don't
support the generic address space for the purposes of these builtins. In
reality, the builtins with the private address space are functionally
equivalent to the generic ones, so users won't notice.

For the sake of code clarity, although the 'generic' keyword is
technically reserved, we know that clang defines it to be the
corresponding unqualified generic address space, so we use that to be
explicit. We always compile with clang so it shouldn't be a problem with
portability.

With this we can also enable a LIT test for HIP, which was previously
failing as it couldn't find the generic address space overloads of fract
and lgamma_r.

There are other builtins that this treatment (may) need applied to, such
as the vload and vstore variants. Those will be handled in a subsequent
patch.
@npmiller
Copy link
Contributor

Friendly ping @intel/llvm-reviewers-runtime @bso-intel

@ldrumm ldrumm merged commit 9e4768c into intel:sycl Mar 21, 2024
12 checks passed
@frasercrmck frasercrmck deleted the libclc-generic branch March 21, 2024 14:15
sommerlukas pushed a commit that referenced this pull request Mar 28, 2024
…ported on Native CPU (#13109)

Similarly to what is done for `nvptx` in
#13015, Native CPU maps `private` and
`generic` to the same address spaces, so we need to avoid getting
multiple definitions for the libclc builtins that use `generic`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants