Port MetalKernels #131

Merged: 16 commits, Mar 28, 2023
Conversation

maxwindiff
Contributor

@maxwindiff maxwindiff commented Mar 14, 2023

Outstanding issues

  • copyto_testsuite used to fail for strange reasons, but seems consistently passing now?

Unsupported currently

@maleadt
Member

maleadt commented Mar 14, 2023

cc @vchuravy @tgymnich

@maleadt
Member

maleadt commented Mar 17, 2023

Can we make use of the specialized thread_position_in_threadgroup_1d and thread_position_in_grid_1d queries instead of having to multiply things (i.e., #76 and #139)?
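For context, the "multiply things" pattern that the specialized queries would replace can be sketched in plain Julia, with no GPU required. The helper name `global_index` is hypothetical and is not part of Metal.jl; it only illustrates the arithmetic that a direct grid-position query makes unnecessary, assuming 1-based indices.

```julia
# Hypothetical helper (not a Metal.jl function): derive a thread's 1-based
# position in the grid from its 1-based threadgroup index, the threadgroup
# size, and its 1-based position within the threadgroup.
global_index(group, group_size, local_idx) = (group - 1) * group_size + local_idx

# Enumerate every (group, local) pair for 3 threadgroups of 4 threads each.
indices = [global_index(g, 4, l) for g in 1:3 for l in 1:4]

# Each thread gets a distinct, contiguous position.
@assert indices == collect(1:12)
```

A direct query such as `thread_position_in_grid_1d` returns this value from the hardware, so the kernel does not need to compute it.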

@maxwindiff
Contributor Author

@maleadt
Member

maleadt commented Mar 18, 2023

Ah, right, not sure how I overlooked those.

@vchuravy
Member

We should expose a query in KA that checks capabilities of the backend. E.g. atomics and Float64 support come to mind.
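One possible shape for such a capability query, sketched in plain Julia. All type and function names here are hypothetical stand-ins chosen for illustration and are not taken from KernelAbstractions; the idea is a trait function with a permissive default that individual backends override. Atomics support could be handled the same way.

```julia
# Hypothetical backend hierarchy, for illustration only.
abstract type AbstractBackend end
struct HostBackend <: AbstractBackend end
struct NoFloat64Backend <: AbstractBackend end  # stands in for a backend without Float64

# Capability trait: assume support by default, let a backend opt out.
supports_float64(::AbstractBackend) = true
supports_float64(::NoFloat64Backend) = false

# A test suite could consult the trait to pick the element types it exercises.
test_eltypes(backend::AbstractBackend) =
    supports_float64(backend) ? (Float32, Float64) : (Float32,)

@assert test_eltypes(HostBackend()) == (Float32, Float64)
@assert test_eltypes(NoFloat64Backend()) == (Float32,)
```

With a query like this, a shared test suite can skip unsupported cases per backend instead of each backend maintaining its own skip list.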

@maxwindiff
Contributor Author

@vchuravy Sounds good, I'll take a stab at adding it.

By the way, I have a question about the SpecialFunctions test. It fails with this error:

julia> a = MtlArray(Float32[1.0])
1-element MtlVector{Float32}:
 1.0

julia> gamma.(a)
ERROR: InvalidIRError: compiling broadcast_kernel(Metal.mtlKernelContext, MtlDeviceVector{Float32, 1}, Val{CartesianIndices((1,))}, Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}, Tuple{Base.OneTo{Int64}}, typeof(gamma), Tuple{Base.Broadcast.Extruded{MtlDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}}}, Int64) in world 32495 resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to gpu_malloc)
...
Reason: unsupported call through a literal pointer (call to tgammaf)
Stacktrace:
 [1] _gamma
   @ ~/.julia/packages/SpecialFunctions/gXPNz/src/gamma.jl:578
 [2] gamma
   @ ~/.julia/packages/SpecialFunctions/gXPNz/src/gamma.jl:567
 [3] _broadcast_getindex_evalf
   @ ./broadcast.jl:670
...

The second error complains about a ccall((:tgammaf, libopenlibm), ...). How did that work for the other backends?

@maxwindiff
Contributor Author

Oh found it - https://github.com/JuliaGPU/CUDA.jl/blob/master/src/device/intrinsics/special_math.jl
There doesn't seem to be an equivalent function in Metal however.

@maxwindiff
Contributor Author

Once JuliaGPU/KernelAbstractions.jl#369 and JuliaGPU/KernelAbstractions.jl#374 are merged, most tests should pass. There's one weird issue remaining, however: the copyto tests fail if the adapt unit test runs before them. Still trying to figure out why...

To disable unrelated tests:

diff --git a/test/kernelabstractions.jl b/test/kernelabstractions.jl
index 6542799..17c79ff 100644
--- a/test/kernelabstractions.jl
+++ b/test/kernelabstractions.jl
@@ -13,4 +13,8 @@ Testsuite.testsuite(()->MetalBackend(), "Metal", Metal, MtlArray, Metal.MtlDevic
     "Convert",           # depends on https://github.com/JuliaGPU/Metal.jl/issues/69
     "SpecialFunctions",  # no equivalent Metal intrinsics for gamma, erf, etc
     "sparse",            # not supported yet
+
+    "partition", "get_backend", "indextest", "Const", "CPU synchronization",
+    "Zero iteration space $(MetalBackend())", "return statement", "fallback test: callable types", "priority",
+    "Localmem", "Private", "Unroll", "Printing", "Compiler", "Reflection", "Examples",
 ]))

@vchuravy
Member

OK, landed both PRs. I can tag 0.9.1.

@vchuravy
Member

> There doesn't seem to be an equivalent function in Metal however.

Yeah, we just need to skip that test.

@maxwindiff maxwindiff marked this pull request as ready for review March 23, 2023 05:11
@maxwindiff
Contributor Author

I didn't change anything except updating dependencies, but now the copyto failure is gone. I guess this is ready for review. @vchuravy @tgymnich I'd appreciate it if you could take another look, especially at the Adapt and copyto implementations.

@maleadt maleadt merged commit 9d500f6 into JuliaGPU:main Mar 28, 2023