Implement a simple ping-pong test for GPU-to-GPU communication using: a) indirect communication via the host, and b) direct communication with HIP-aware MPI.
The ping-pong test consists of the following steps:
- Send a vector from one GPU to another
- The receiving GPU should increment all elements of the vector by one
- Send the vector back to the original GPU
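
The data flow of the steps above can be modelled on the host alone, as a rough sketch for checking the expected result. This is not the skeleton's code: the function names (increment_all, ping_pong_round_trip, round_trip_is_correct) are hypothetical, and plain vector copies stand in for the MPI send/receive pairs and for the GPU increment kernel.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the receiver's work: add one to every element.
// In the GPU versions this would be a HIP kernel operating on device memory.
void increment_all(std::vector<double>& data) {
    for (double& x : data) x += 1.0;
}

// Host-only model of one ping-pong round trip between "rank 0" and "rank 1".
// Plain vector copies stand in for the MPI_Send/MPI_Recv pairs.
std::vector<double> ping_pong_round_trip(const std::vector<double>& rank0_data) {
    std::vector<double> rank1_data = rank0_data;  // ping: rank 0 -> rank 1
    increment_all(rank1_data);                    // receiver increments by one
    return rank1_data;                            // pong: rank 1 -> rank 0
}

// Returns true if every element of `after` equals the matching element of
// `before` plus one, which is what a correct round trip should produce.
bool round_trip_is_correct(const std::vector<double>& before,
                           const std::vector<double>& after) {
    if (before.size() != after.size()) return false;
    for (std::size_t i = 0; i < before.size(); ++i) {
        if (after[i] != before[i] + 1.0) return false;
    }
    return true;
}
```

In the real exercise the vector lives in GPU memory. Variant a) would copy it to a host buffer (for example with hipMemcpy) before each send and back to the device after each receive, while variant b) would pass the device pointer directly to the MPI calls. A check like round_trip_is_correct can then be run on the data copied back to the host on the original rank.
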
For reference, there is also a CPU-to-CPU implementation in the skeleton code (ping-pong.cpp). Timing of all tests is also included to compare the execution times.
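
How the skeleton measures time is not shown here. As an illustration only, a wall-clock timer around a repeated round trip could look like the sketch below; the helper name average_seconds and the use of std::chrono are assumptions (an MPI code might use MPI_Wtime instead).

```cpp
#include <chrono>

// Runs `work` `repeats` times and returns the average wall-clock time
// per call in seconds. `work` is any callable taking no arguments.
template <typename Work>
double average_seconds(Work&& work, int repeats) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) work();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double> elapsed = stop - start;
    return elapsed.count() / repeats;
}
```

Averaging over several round trips reduces the influence of one-off startup costs when comparing the CPU-to-CPU, host-staged, and HIP-aware variants.
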
To compile, load the required modules and type make. On Puhti, a HIP-aware MPI is available with:
ml openmpi/4.1.4-cuda
To run the test, use two MPI processes and two GPUs (one GPU per process).