Questions about GPU workflow in gem5 #1613
-
I am a beginner studying the GPU model in gem5. I use the following command to run it:
The -c parameter specifies the bin/square file, which is an executable and is not directly human-readable. Could anyone help explain these? Thank you very much.
Replies: 1 comment 5 replies
-
Hi @Ronchi1997, yes, -c specifies the binary. I'm not sure you need answers to your other questions in order to use gem5, because they relate more to how GPGPU programming languages compile code.
That said, the HSA (and similar) packets sent to the command processor (CP)/packet processor (PP) are essentially how the GPGPU programming language encodes information for the GPU (e.g., run this kernel, copy this memory, etc.). The CP takes these packets and performs the appropriate action on the GPU (e.g., running the kernel the packet encodes).
The kernel objects specifically point to the binary that HIP (AMD's GPGPU programming language) generates (the "machine instructions"). This part is done through LLVM. I don't know the exact details of how it works, but again, you don't need to know them to run workloads in gem5 unless you want to modify the compiler itself.
The wavefront size is fixed in AMD GPUs (currently 64 threads/wavefront in what gem5 supports). You don't control this: the CP takes the information from your kernel object and converts it into N wavefronts, where N is ceil(number of threads in kernel / 64).
You may also find these slides useful: https://www.gem5.org/assets/files/isca2024-tutorial/05-gpu.pdf (e.g., slides 30-33)
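
To make the wavefront arithmetic concrete, here is a minimal Python sketch of the formula quoted above, N = ceil(threads / 64). The function name `wavefront_count` and the constant `WAVEFRONT_SIZE` are made up for illustration and are not gem5 identifiers; the sketch only reproduces the arithmetic, not how the CP actually dispatches work.

```python
# Illustrative only: these names are hypothetical, not part of gem5.
WAVEFRONT_SIZE = 64  # threads per wavefront, as stated in the reply above


def wavefront_count(num_threads: int, wavefront_size: int = WAVEFRONT_SIZE) -> int:
    """Return ceil(num_threads / wavefront_size) using integer arithmetic."""
    if num_threads < 0 or wavefront_size <= 0:
        raise ValueError("num_threads must be >= 0 and wavefront_size > 0")
    # Adding (wavefront_size - 1) before floor division rounds up.
    return (num_threads + wavefront_size - 1) // wavefront_size


if __name__ == "__main__":
    for threads in (1, 64, 65, 1000):
        print(threads, "threads ->", wavefront_count(threads), "wavefront(s)")
```

So a kernel with 65 threads still needs two wavefronts, and the second one is mostly idle lanes.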
The details can get complicated so I'll build on the answer from @mattsinc and point to some gem5 locations.
One of the AQL packets described on slide 32 is a dispatch packet. This is defined in the HSA Runtime Programmer's Reference in Section 2.6.5.7. The "kernel_object" field is the address of a "code object." The code object is defined in gem5 in gpu-compute/kernel_code.hh based on the LLVM description in the LLVM AMDGPU User Guide. The GPU Command Processor calculates the start of the kernel code and creates an HSA task with the initial starting PC.
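
As a rough illustration of the path from dispatch packet to starting PC, the Python sketch below unpacks a 64-byte AQL kernel dispatch packet and adds an entry offset to `kernel_object`. The field order is my reading of `hsa_kernel_dispatch_packet_t` in the HSA runtime reference, so please check it against the spec. The helpers `parse_dispatch_packet` and `entry_pc` are hypothetical, and `entry_byte_offset` stands in for whatever offset the code object supplies; this is not gem5's actual code, which lives in the C++ sources mentioned above.

```python
import struct
from collections import namedtuple

# Assumed layout (little-endian, 64 bytes): six uint16, five uint32, four uint64.
DISPATCH_PACKET_FORMAT = "<6H5I4Q"

DispatchPacket = namedtuple(
    "DispatchPacket",
    [
        "header", "setup",
        "workgroup_size_x", "workgroup_size_y", "workgroup_size_z",
        "reserved0",
        "grid_size_x", "grid_size_y", "grid_size_z",
        "private_segment_size", "group_segment_size",
        "kernel_object",      # address of the code object
        "kernarg_address",    # address of the kernel arguments
        "reserved2",
        "completion_signal",
    ],
)


def parse_dispatch_packet(raw: bytes) -> DispatchPacket:
    """Decode one 64-byte dispatch packet (hypothetical helper)."""
    if len(raw) != struct.calcsize(DISPATCH_PACKET_FORMAT):
        raise ValueError("expected a 64-byte packet")
    return DispatchPacket(*struct.unpack(DISPATCH_PACKET_FORMAT, raw))


def entry_pc(kernel_object: int, entry_byte_offset: int) -> int:
    """Assumed rule: starting PC = code object address + entry offset."""
    return kernel_object + entry_byte_offset


if __name__ == "__main__":
    # Made-up values purely for demonstration.
    raw = struct.pack(DISPATCH_PACKET_FORMAT,
                      2, 1, 64, 1, 1, 0, 1024, 1, 1, 0, 0,
                      0x1000, 0x2000, 0, 0)
    pkt = parse_dispatch_packet(raw)
    print(hex(entry_pc(pkt.kernel_object, 0x100)))
```

The offset is passed in as a parameter on purpose, since where it sits inside the code object depends on the code object version described in the LLVM guide.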
Next, some dispatch magic happens, which is a whole other topic. Eventually, individual wavefronts are started and the compute unit sets t…