Illegal Memory Access Error During Gradient Calculation of predefined losses on GPU RTX 4050 #2361

yolhan83 · 2023-12-17T22:36:32Z

I'm experiencing a problem with gradient calculations on a GPU using Flux.jl. Below is a minimal example that demonstrates the issue:

using Flux, CUDA, cuDNN

x = rand(Float32, 1, 1000) |> gpu
y = rand(Float32, 1, 1000) |> gpu

model = Flux.Chain(
    Flux.Dense(1, 10, tanh),
    Flux.Dense(10, 10, tanh),
    Flux.Dense(10, 1)
) |> gpu

loss(model, x, y) = Flux.mse(model(x), y)
loss(model, x, y)
gradient(loss, model, x, y)

When executing the gradient calculation (last line), I encounter the following error:

ERROR: CUDA error: an illegal memory access was encountered (code 700, ERROR_ILLEGAL_ADDRESS)
ERROR: WARNING: Error while freeing DeviceBuffer(3.906 KiB at 0x000000020502c800): CUDA.CuError(code=CUDA.cudaError_enum(0x000002bc), details=CUDA.Optional{String}(data=nothing))

Interestingly, when I modify the loss function to the following, the error no longer occurs, and the code runs as expected:

using LinearAlgebra  # provides norm
loss(model, x, y) = norm(model(x) .- y) ./ size(y, 2)

Given this behavior, I suspect the issue is specific to the Flux.Losses functions (I tested mse and crossentropy, and both fail) or to their interaction with CUDA.jl when calculating gradients.

I've ensured that all libraries (Flux.jl, CUDA.jl, cuDNN) and drivers are up to date. The error persists despite various attempts to debug and isolate the issue.

Any insights or suggestions on how to address this problem would be greatly appreciated.

Thank you for your assistance.

Julia: 1.9.4
[052768ef] CUDA v5.1.1
[587475ba] Flux v0.14.7
[02a925ec] cuDNN v1.2.1


ToucheSir · 2023-12-17T23:48:56Z

mse uses mean, so this is a duplicate of FluxML/Zygote.jl#1473. As I mentioned on Slack, please give the troubleshooting instructions in that issue a try to help us figure out what's going on with CUDA + Zygote.
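To make the link between mse and mean concrete, here is a minimal sketch that checks the equivalence and swaps in a sum-based aggregation through the `agg` keyword of `Flux.mse`. It runs on the CPU so it can be checked anywhere. The helper name `sum_agg` is hypothetical, and whether this substitution avoids the illegal memory access on the affected GPU setup is an untested assumption, not a confirmed workaround.

```julia
using Flux, Statistics

x = rand(Float32, 1, 1000)
y = rand(Float32, 1, 1000)
model = Chain(Dense(1 => 10, tanh), Dense(10 => 10, tanh), Dense(10 => 1))

ŷ = model(x)

# With its default aggregation, Flux.mse reduces the squared errors with `mean`.
mse_default = Flux.mse(ŷ, y)
mse_manual = mean(abs2.(ŷ .- y))

# Hypothetical helper: same value as `mean`, but written with `sum` and `length`.
sum_agg(a) = sum(a) / length(a)
mse_sum = Flux.mse(ŷ, y; agg = sum_agg)

# Same loss as in the report, except that `mean` is never called.
loss_sum(m, x, y) = Flux.mse(m(x), y; agg = sum_agg)
grads = gradient(loss_sum, model, x, y)
```

To try this against the original report, pipe `x`, `y`, and `model` through `gpu` as in the minimal example. If `loss_sum` differentiates cleanly where the default `Flux.mse` does not, that would point toward the `mean` code path rather than Flux.Losses itself.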

ToucheSir closed this as not planned on Dec 17, 2023
