cms-patatrack/test-dynamic-parallelism

Instructions

  • edit the Makefile to point CUDA_BASE to a CUDA installation (e.g. `CUDA_BASE = /usr/local/cuda`)
  • run make to build all binaries:
    $ make
    /usr/local/cuda/bin/nvcc --generate-line-info --source-in-ptx --expt-relaxed-constexpr --expt-extended-lambda -std=c++14 -O2 --cudart=shared -gencode arch=compute_50,code=sm_50 --compiler-options "-O2 -pthread -fPIC" -dc foo.cu -o foo.o
    /usr/local/cuda/bin/nvcc --generate-line-info --source-in-ptx --expt-relaxed-constexpr --expt-extended-lambda -std=c++14 -O2 --cudart=shared -gencode arch=compute_50,code=sm_50 --compiler-options "-O2 -pthread -fPIC" -dlink foo.o -o foo_dlink.o
    g++ -O2 -pthread -fPIC --shared foo.o foo_dlink.o -L/usr/local/cuda/lib64 -lcudart -lcudadevrt -Wl,-rpath -Wl,/usr/local/cuda/lib64 -o libfoo.so
    /usr/local/cuda/bin/nvcc --generate-line-info --source-in-ptx --expt-relaxed-constexpr --expt-extended-lambda -std=c++14 -O2 --cudart=shared -gencode arch=compute_50,code=sm_50 --compiler-options "-O2 -pthread -fPIC" -dc bar.cu -o bar.o
    /usr/local/cuda/bin/nvcc --generate-line-info --source-in-ptx --expt-relaxed-constexpr --expt-extended-lambda -std=c++14 -O2 --cudart=shared -gencode arch=compute_50,code=sm_50 --compiler-options "-O2 -pthread -fPIC" -dlink bar.o -o bar_dlink.o
    g++ -O2 -pthread -fPIC --shared bar.o bar_dlink.o -L/usr/local/cuda/lib64 -lcudart -lcudadevrt -Wl,-rpath -Wl,/usr/local/cuda/lib64 -o libbar.so
    g++ -O2 -pthread -fPIC main.cc -L. -lfoo -ldl -Wl,-rpath -Wl,. -L/usr/local/cuda/lib64 -lcudart -lcudadevrt -Wl,-rpath -Wl,/usr/local/cuda/lib64 -o test
    /usr/local/cuda/bin/nvcc --generate-line-info --source-in-ptx --expt-relaxed-constexpr --expt-extended-lambda -std=c++14 -O2 --cudart=shared -gencode arch=compute_50,code=sm_50 --compiler-options "-O2 -pthread -fPIC" -DMAY_CRASH -dc bar.cu -o bar_crash.o
    /usr/local/cuda/bin/nvcc --generate-line-info --source-in-ptx --expt-relaxed-constexpr --expt-extended-lambda -std=c++14 -O2 --cudart=shared -gencode arch=compute_50,code=sm_50 --compiler-options "-O2 -pthread -fPIC" -dlink bar_crash.o -o bar_crash_dlink.o
    g++ -O2 -pthread -fPIC --shared bar_crash.o bar_crash_dlink.o -L/usr/local/cuda/lib64 -lcudart -lcudadevrt -Wl,-rpath -Wl,/usr/local/cuda/lib64 -o libbar_crash.so
    g++ -DMAY_CRASH -O2 -pthread -fPIC main.cc -L. -lfoo -ldl -Wl,-rpath -Wl,. -L/usr/local/cuda/lib64 -lcudart -lcudadevrt -Wl,-rpath -Wl,/usr/local/cuda/lib64 -o crash
  • run test:
    $ ./test
    Hello
    Loaded
    bar
    
  • run crash:
    $ ./crash
    Hello
    Loaded
    bar.cu, line 24: cudaErrorInvalidDeviceFunction: invalid device function
    

Description

main() does the following:

  • calls fooWrapper() from foo.cu, which in turn launches a CUDA kernel foo<<<1,1>>>()
  • dynamically loads the shared library libbar.so
  • upon loading, libbar.so calls wrapper(), which in turn launches a CUDA kernel bar<<<1,1>>>()
  • if libbar.so also contains another kernel that uses dynamic parallelism (e.g. one that calls bar<<<1,1>>>() and cudaDeviceSynchronize()), the launch of bar<<<1,1>>>() from wrapper() fails with cudaErrorInvalidDeviceFunction, even if the kernel that uses dynamic parallelism is never launched
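
The dynamic-loading step can be sketched on the host side as follows. This is a minimal illustration and not the contents of main.cc: the helper loadLibrary() and the LoadResult struct are hypothetical names, and the choice of dlopen() flags is an assumption. Any load-time initializers in the library run before dlopen() returns, so a library that calls something like wrapper() on load would do so inside this call.

```cpp
#include <dlfcn.h>  // dlopen, dlerror (POSIX dynamic loading)
#include <string>

// Hypothetical result type (not from main.cc): the library handle,
// plus the loader's error text when loading fails.
struct LoadResult {
  void* handle;       // nullptr on failure
  std::string error;  // empty on success
};

// Hypothetical helper (not from main.cc): open a shared library by path.
// Load-time initializers of the library run before dlopen() returns.
LoadResult loadLibrary(const char* path) {
  dlerror();  // clear any stale error message
  // Flags are an assumption: resolve all symbols now, keep them local.
  void* handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
  LoadResult result{handle, ""};
  if (handle == nullptr) {
    const char* message = dlerror();
    result.error = message ? message : "unknown dlopen error";
  }
  return result;
}
```

A caller would invoke something like loadLibrary("./libbar.so") and print result.error if the handle is null. As in the g++ lines of the build output, linking may require -ldl.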

About

Test case for dynamic parallelism
