Execution #5248

Open
Shankarjatav opened this issue Jan 22, 2025 · 7 comments

@Shankarjatav

I am having trouble running PIConGPU.

@PrometheusPi
Member

Hi @Shankarjatav,
thanks for reaching out with your issue executing PIConGPU.
Could you please provide more information:

  • On what machine do you want to run PIConGPU and what hardware does it have?
  • At what stage of the compile process and/or run process do you run into problems?

Ideally, you could post your error message here.

@Shankarjatav
Author

mpirun -n 1 picongpu -d 1 1 1 -g 128 128 128
[rpgpu001:1466223:0:1466223] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x440000e8)

@psychocoderHPC
Member

psychocoderHPC commented Jan 22, 2025

  • Which version of PIConGPU do you use?
  • Which operating system do you use?
  • Do you run on CPU or GPU?
  • [if you use Linux] Please run
    • cat /proc/cpuinfo
    • free
    • [if you use an NVIDIA GPU] run
      • nvidia-smi
  • [if you use Linux] Please run mpirun -n 1 env
  • Could you try mpirun -n 1 picongpu -d 1 1 1 -g 32 32 32 --periodic 1 1 1 -s 10

Please provide the complete output, do not skip any output.
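
To make it easier to post the complete output in one piece, the commands listed above could be wrapped in a small script. This is only a sketch: the helper names `capture` and `collect` and the file name `picongpu_diagnostics.txt` are assumptions made for illustration and are not part of PIConGPU.

```python
import subprocess

# Commands taken from the list above. nvidia-smi only applies to NVIDIA GPUs.
COMMANDS = [
    ["cat", "/proc/cpuinfo"],
    ["free"],
    ["nvidia-smi"],
    ["mpirun", "-n", "1", "env"],
    ["mpirun", "-n", "1", "picongpu", "-d", "1", "1", "1",
     "-g", "32", "32", "32", "--periodic", "1", "1", "1", "-s", "10"],
]


def capture(command):
    """Run one command and return its unabridged stdout and stderr as text."""
    header = "$ " + " ".join(command)
    try:
        proc = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep error messages in order with normal output
            text=True,
            errors="replace",          # do not fail on undecodable bytes
        )
    except FileNotFoundError:
        return header + "\n[command not found]\n"
    return f"{header}\n{proc.stdout}[exit code {proc.returncode}]\n"


def collect(commands=COMMANDS, path="picongpu_diagnostics.txt"):
    """Run every command and write the combined report to a file (hypothetical name)."""
    report = "\n".join(capture(command) for command in commands)
    with open(path, "w") as handle:
        handle.write(report)
    return report
```

Calling `collect()` in the environment used to launch PIConGPU writes one text file whose contents can be pasted here without trimming.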

@Shankarjatav
Author

PIConGPU: 0.8.0-dev
  Build-Type: Release

Third party:
  OS: Linux-4.18.0-477.27.2.el8_8.x86_64
  arch: x86_64
  CXX: GNU (12.2.0)
  CMake: 3.27.9
  CUDA: 12.4.99
  mallocMC: 2.6.0
  Boost: 1.84.0
  MPI:
    standard: 3.1
    flavor: MPICH (3.4a2)
  PNGwriter: 0.7.0
  openPMD: 0.15.0

@Shankarjatav
Author

NVIDIA A100

@Shankarjatav
Author

Shankarjatav commented Jan 22, 2025

mpirun -n 1 picongpu -d 1 1 1 -g 32 32 32 --periodic 1 1 1 -s 10
[rpgpu007:3422650:0:3422650] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x440000e8)
==== backtrace (tid:3422650) ====
 0  /home/apps/SPACK/spack/opt/spack/linux-almalinux8-cascadelake/gcc-12.2.0/ucx-1.16.0-hplhmgb2auitnd3t4nhocz6fyvxaqe3t/lib/libucs.so.0(ucs_handle_error+0x2c4) [0x7f351792d104]
 1  /home/apps/SPACK/spack/opt/spack/linux-almalinux8-cascadelake/gcc-12.2.0/ucx-1.16.0-hplhmgb2auitnd3t4nhocz6fyvxaqe3t/lib/libucs.so.0(+0x322c4) [0x7f351792d2c4]
 2  /home/apps/SPACK/spack/opt/spack/linux-almalinux8-cascadelake/gcc-12.2.0/ucx-1.16.0-hplhmgb2auitnd3t4nhocz6fyvxaqe3t/lib/libucs.so.0(+0x32586) [0x7f351792d586]
 3  /lib64/libpthread.so.0(+0x12cf0) [0x7f351cec8cf0]
 4  /home/apps/SPACK/spack/opt/spack/linux-almalinux8-cascadelake/gcc-12.2.0/openmpi-4.1.6-vx74yf7mr3cg7bwg6v7kzjtqsg6u2ros/lib/libmpi.so.40(PMPI_Comm_size+0x37) [0x7f3523fc7ed7]
 5  picongpu() [0x7820ef]
 6  picongpu() [0x506e25]
 7  picongpu() [0x4a80b8]
 8  picongpu() [0x45fa6d]
 9  /lib64/libc.so.6(__libc_start_main+0xe5) [0x7f351c09bd85]
10  picongpu() [0x47552e]
=================================
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 3422650 on node rpgpu007 exited on signal 11 (Segmentation fault).

@PrometheusPi
Member

PrometheusPi commented Jan 22, 2025

Does executing mpirun -n 1 picongpu --help work properly, and what does ldd picongpu return?
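
One thing worth checking in the ldd output is which MPI library is picked up at runtime. The version output above reports MPICH (3.4a2) as the MPI flavor, while the backtrace resolves libmpi.so.40 from an openmpi-4.1.6 directory. The following is a minimal sketch for scanning saved ldd output. The function name `mpi_libraries` is hypothetical, and guessing the flavor from the install path is a heuristic assumption that only works when the path contains "openmpi" or "mpich".

```python
import re

# Matches ldd lines such as "libX.so => /path/libX.so (0x...)" or "libX.so => not found".
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(.+?)(?:\s+\(0x[0-9a-fA-F]+\))?\s*$")


def mpi_libraries(ldd_output):
    """Return (soname, resolved_path, flavor_guess) for each MPI-looking library."""
    results = []
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if not match:
            continue  # e.g. linux-vdso or the dynamic loader, which have no "=>"
        soname, resolved = match.groups()
        if not soname.startswith("libmpi"):
            continue
        lowered = resolved.lower()
        if resolved == "not found":
            guess = "missing"
        elif "openmpi" in lowered or "open-mpi" in lowered:
            guess = "Open MPI"  # heuristic: based on the directory name only
        elif "mpich" in lowered:
            guess = "MPICH"     # heuristic: based on the directory name only
        else:
            guess = "unknown"
        results.append((soname, resolved, guess))
    return results
```

For example, after saving the output with `ldd picongpu > ldd.txt`, calling `mpi_libraries(open("ldd.txt").read())` lists the MPI libraries the binary would load. If the guessed flavor differs from the one PIConGPU reports it was built with, that mismatch would be useful to mention in the reply.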
