I'm experiencing a problem with gradient calculations on a GPU using Flux.jl. Below is a minimal example that demonstrates the issue:
using Flux, CUDA, cuDNN

x = rand(Float32, 1, 1000) |> gpu
y = rand(Float32, 1, 1000) |> gpu

model = Flux.Chain(
    Flux.Dense(1, 10, tanh),
    Flux.Dense(10, 10, tanh),
    Flux.Dense(10, 1),
) |> gpu

loss(model, x, y) = Flux.mse(model(x), y)

loss(model, x, y)            # forward pass
gradient(loss, model, x, y)  # the error is raised here
When executing the gradient calculation (last line), I encounter the following error:
ERROR: CUDA error: an illegal memory access was encountered (code 700, ERROR_ILLEGAL_ADDRESS)
ERROR: WARNING: Error while freeing DeviceBuffer(3.906 KiB at 0x000000020502c800): CUDA.CuError(code=CUDA.cudaError_enum(0x000002bc), details=CUDA.Optional{String}(data=nothing))
Interestingly, when I change the loss function to the following, the error no longer occurs and the code runs as expected:
loss(model, x, y) = norm(model(x) .- y) ./ size(y, 2)
Given this behavior, I suspect the issue is specific to the Flux.Losses functions (I tested mse and crossentropy, and both fail) or to their interaction with CUDA.jl when calculating gradients.
I've ensured that all libraries (Flux.jl, CUDA.jl, cuDNN) and drivers are up to date. The error persists despite various attempts to debug and isolate the issue.
Any insights or suggestions on how to address this problem would be greatly appreciated.
Thank you for your assistance.
Julia: 1.9.4
[052768ef] CUDA v5.1.1
[587475ba] Flux v0.14.7
[02a925ec] cuDNN v1.2.1
mse uses mean, so this is a duplicate of FluxML/Zygote.jl#1473. As I mentioned on Slack, please give the troubleshooting instructions in that issue a try to help us figure out what's going on with CUDA + Zygote.
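To make the "mse uses mean" point concrete, here is a minimal CPU-only sketch. The helper names (`mse_via_mean`, `mse_via_sum`, `norm_loss`) are hypothetical and not part of Flux; `mse_via_mean` is a hand-written stand-in for what `Flux.mse` computes with its default aggregation, and `mse_via_sum` is an assumed equivalent that has not been tested against the GPU error discussed here.

```julia
using Statistics: mean
using LinearAlgebra: norm

# Stand-in for Flux.mse with its default aggregation: mean of squared differences.
mse_via_mean(yhat, y) = mean(abs2, yhat .- y)

# Same value written with `sum` instead of `mean`
# (hypothetical alternative, not taken from the thread).
mse_via_sum(yhat, y) = sum(abs2, yhat .- y) / length(y)

# The loss from the report that does not trigger the error.
# It is a scaled Euclidean norm, so it is NOT the same quantity as the MSE.
norm_loss(yhat, y) = norm(yhat .- y) / size(y, 2)

yhat = Float32[1 2 3 4]
y    = Float32[0 0 0 0]

println(mse_via_mean(yhat, y))  # mean of 1, 4, 9, 16
println(mse_via_sum(yhat, y))   # same value, computed without `mean`
println(norm_loss(yhat, y))     # sqrt(30) / 4, a different number
```

The norm-based loss goes through a different code path than `mean`, which is consistent with it not hitting the error, but it also changes the value being minimized, so it is a diagnostic rather than a drop-in replacement for `mse`.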