Move to Alpaka Memory + Alpaka 0.9 + CMSSW Alpaka Caching Allocator #292
I fixed a small bug with the last commit where segmentsInGPU was not being deleted when the Event destructor was called at the end of a run. It led to a mismatch in the number of cudaMallocs vs cudaFrees.
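To illustrate the kind of bug described above, here is a minimal host-only sketch. Everything in it is hypothetical: `AllocStats`, `Buffer`, and the two `Event` members are stand-ins invented for illustration, with `std::malloc`/`std::free` in place of `cudaMalloc`/`cudaFree`. It is not the actual SDL `Event` class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical counters standing in for the cudaMalloc/cudaFree tallies.
struct AllocStats {
    int allocs = 0;
    int frees = 0;
};

// Hypothetical stand-in for a device-side container such as segmentsInGPU.
// std::malloc/std::free are used here in place of cudaMalloc/cudaFree.
struct Buffer {
    float* data;
    AllocStats& stats;

    Buffer(std::size_t n, AllocStats& s)
        : data(static_cast<float*>(std::malloc(n * sizeof(float)))), stats(s) {
        ++stats.allocs;
    }
    ~Buffer() {
        std::free(data);
        ++stats.frees;
    }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
};

// Hypothetical Event that owns its buffers through raw pointers.
class Event {
public:
    explicit Event(AllocStats& s)
        : hits_(new Buffer(16, s)), segments_(new Buffer(16, s)) {}
    ~Event() {
        delete hits_;
        delete segments_;  // omitting this line leaves frees one short of allocs
    }
    Event(const Event&) = delete;
    Event& operator=(const Event&) = delete;

private:
    Buffer* hits_;
    Buffer* segments_;
};

// Builds and destroys one Event, then returns the allocation tallies.
AllocStats runOneEvent() {
    AllocStats stats;
    {
        Event event(stats);
    }  // the Event destructor runs here
    assert(stats.allocs == stats.frees);
    return stats;
}
```

Holding the members as `std::unique_ptr<Buffer>` would make the explicit `delete` calls unnecessary, so a forgotten member could not cause this kind of mismatch.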
Do you happen to know when it's called? Is it before or after the TC kernel or the related allocations/reads/writes?
It's hanging in the removeDupQuintupletsInGPUBeforeTC kernel. I have to look into it more.
Unfortunately, the ntuple-writing runtime error was not fixed by moving to a newer version of the caching allocator. I'm looking into it more, but I may open the PR for review even if I can't find a solution, since it only affects ntuple writing with the caching allocator turned on.
Turning off the half-precision code allowed the compiler to find two unused variables in one of the smaller kernels.
After speaking with @VourMa over Skype, we've agreed to merge this PR into the Alpaka branch first, rebase onto the current master, and then do a final review of the Alpaka branch before merging it into the master branch.
First set of comments is there - to be continued...
I have a general question: What are the files under code/alpaka_interface supposed to do? Are they helper functions for the Alpaka memory management? If so, could you add a small description of their purpose?
Please note that this file is obsolete, replaced by setup_ucsd.sh - not sure if this was meant to be fixed in the rebase or it was missed.
Also, has the whole setup been tested on any of the UCSD machines?
Also, I see that setup_lnx7188.sh has switched to el8. Could the two files be reconciled once again?
Since part of the Alpaka migration is the cleanup of the cpu code, is this file (and its corresponding implementation) being used anywhere or can they be removed (instead of moved)?
Sorry, wrong PR, copying to the proper one...
This PR moves the memory management for the codebase over to Alpaka. It also removes unused memory functions in a few files, and gets rid of the unused PrintUtil files in SDL (along with a couple of unused print functions). Lastly, by including Alpaka in the main makefile it moves some of the required Alpaka statements from Event.cuh to Constants.cuh. See timing and performance plots below. Note that the timing is slower because the Alpaka caching allocator is not yet in place.
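As background for the timing note, here is a toy sketch of the general idea behind a caching allocator: freed blocks are kept and reused instead of being returned to the backend each time. The class `ToyCachingAllocator` and the helper `backendAllocsForEvents` are hypothetical and host-only (`std::malloc` stands in for the device allocator). This is not the CMSSW or Alpaka implementation. Production caching allocators usually round sizes into bins, whereas this sketch matches exact sizes only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

// Hypothetical toy caching allocator, for illustration only.
class ToyCachingAllocator {
public:
    ToyCachingAllocator() = default;
    ToyCachingAllocator(const ToyCachingAllocator&) = delete;
    ToyCachingAllocator& operator=(const ToyCachingAllocator&) = delete;

    // Cached blocks go back to the backend only when the allocator is destroyed.
    ~ToyCachingAllocator() {
        for (auto& bin : cache_) {
            for (void* p : bin.second) {
                std::free(p);
            }
        }
    }

    // Reuse a cached block of exactly this size if one exists;
    // otherwise fall through to the (slow) backend allocation.
    void* allocate(std::size_t bytes) {
        auto it = cache_.find(bytes);
        if (it != cache_.end() && !it->second.empty()) {
            void* p = it->second.back();
            it->second.pop_back();
            return p;
        }
        ++backendAllocs_;
        return std::malloc(bytes);
    }

    // Keep the block for later reuse instead of freeing it immediately.
    void release(void* p, std::size_t bytes) { cache_[bytes].push_back(p); }

    int backendAllocs() const { return backendAllocs_; }

private:
    std::map<std::size_t, std::vector<void*>> cache_;
    int backendAllocs_ = 0;
};

// Simulates nEvents events that each allocate and release one 1 KiB buffer,
// and returns how many times the backend allocator was actually called.
int backendAllocsForEvents(int nEvents) {
    ToyCachingAllocator alloc;
    for (int i = 0; i < nEvents; ++i) {
        void* p = alloc.allocate(1024);
        alloc.release(p, 1024);
    }
    return alloc.backendAllocs();
}
```

In this sketch, a loop over many events with the same buffer size calls the backend once, where allocating and freeing per event would call it once per event.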