-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[HIP] USM-P2P extension implementation #1234
Conversation
Uses the latest non deprecated hip APIs. fully tested and passing via DPC++ using Crusher. Signed-off-by: JackAKirk <[email protected]>
Note that proper testing of this PR cannot be made via intel/llvm CI, since a system with two gpus is required. Eventually I will open a PR to update these tests in intel/llvm. However the form of these tests depends on intel/llvm#10737 getting merged. cc @jinz2014 |
source/adapters/hip/usm_p2p.cpp
Outdated
return UR_RESULT_ERROR_UNSUPPORTED_FEATURE; | ||
UR_APIEXPORT ur_result_t UR_APICALL urUsmP2PDisablePeerAccessExp( | ||
ur_device_handle_t commandDevice, ur_device_handle_t peerDevice) { | ||
ur_result_t result = UR_RESULT_SUCCESS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use the pattern where instead of using this result
var you just return err
in the catch
and otherwise return UR_RESULT_SUCCESS
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
source/adapters/hip/usm_p2p.cpp
Outdated
return ReturnValue(uint32_t{0}); | ||
|
||
int value; | ||
hipDeviceP2PAttr hip_attr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually in the hip adapter we use UpperCamelCase for vars. You have a mix of lowerCamelCase and snake_case in this PR. Could you make consistent please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
Should we enable some tests in DPC++ to check this behaviour? Can we test for this in UR? |
See #1234 (comment) |
Aha sorry missed this |
For UR testing it is a good question. I remember some discussion around having (or moving) tests like this in UR: There could also in principle be e2e integration testing in UR, that for example could test integration of features, such as enabling p2p, and then performing usm copies via p2p. I do not know if tests like this exist or are desired in the future. |
Signed-off-by: JackAKirk <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I thought about it a bit more. I could add a test at this point and happy to do so if people want it. I think that it would require a system with two or more gpus in each platform for it to be executed, essentially the same as p2p tests in intel/llvm but limited to ur scope. I think such a test would be most sensible at the "conformance" level (although my reading is atm such tests aren't written for extensions?), such that it only uses UR apis portably and tests correct e2e behavior. Doing any kind of testing with just a single gpu would have to be platform specific I think and would only be able to test that the correct native errors are returned when you try to do any p2p with only one gpu. I don't think that would really be worthwhile since you are not really testing much without at least two gpus. |
Signed-off-by: JackAKirk <[email protected]>
There are similar tests in DPC++ to test for buffer migration https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/Basic/buffer/buffer_migrate.cpp#L25 that only test the desired behaviour if using a multi gpu system. Something similar could be done here. Therefore the tests would always pass on a single gpu system but would test the desired behaviour by default on a multi gpu system |
Yeah it would be identical to intel/llvm@sycl...JackAKirk:llvm:p2p-make-test-portable |
Codecov ReportAll modified and coverable lines are covered by tests ✅
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #1234 +/- ##
=======================================
Coverage 15.39% 15.39%
=======================================
Files 240 240
Lines 34099 34099
Branches 3775 3775
=======================================
Hits 5250 5250
Misses 28798 28798
Partials 51 51 ☔ View full report in Codecov by Sentry. |
source/adapters/cuda/usm_p2p.cpp
Outdated
} | ||
return result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes appear unrelated. Please submit them as a separate cleanup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of the overhead involved in getting UR PRs merged I would be inclined to make an exception and keep these changes in this PR. They will still appear as a separate commit since we don't squash in UR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not a good reason. The hardest bug I ever had to solve was because of "just an extra bit of code" unrelated to the commit message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right. If we don't squash, then it's slightly better, but it still feels unhygienic to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I defer to your judgement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New PR for those format changes is here: #1311
Also I think (but not certain) in UR it should be easier to make a portable test for hip/cuda/l0, even when cuda doesn't have the multi-device context patch. I will have a go, and plan to update this PR with the test. |
Is there a plan to have tests for unified-runtime extensions conformance? atm I don't see such tests in the UR repo. I've been staring at the test folder and I think the only testing of value for this P2P extension would be a backend agnostic conformance test, but atm there isn't an existing example I can see for extensions, only core spec. If you think such a thing would be valuable I am happy to introduce such testing. However I do not want to do something new like this without consulting you first. |
These changes will be made in separate PRs. Signed-off-by: JackAKirk <[email protected]>
Experimental features are not required to have CTS tests but adding tests for them is very welcome. One of the criteria for making an experimental feature core is to have CTS tests. Please go ahead with adding P2P experimental feature tests. |
I've added a test here: #1369 |
Uses the latest non deprecated hip APIs. This enables the DPC++ hip backend to use the P2P extension. Full support is enabled.