Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HIP] USM-P2P extension implementation #1234

Closed
wants to merge 6 commits into from

Conversation

JackAKirk
Copy link
Contributor

Uses the latest non deprecated hip APIs. This enables the DPC++ hip backend to use the P2P extension. Full support is enabled.

Uses the latest non deprecated hip APIs.

fully tested and passing via DPC++ using Crusher.

Signed-off-by: JackAKirk <[email protected]>
@JackAKirk JackAKirk requested a review from a team as a code owner January 8, 2024 16:40
@JackAKirk JackAKirk requested a review from hdelan January 8, 2024 16:40
@JackAKirk
Copy link
Contributor Author

Note that proper testing of this PR cannot be made via intel/llvm CI, since a system with two gpus is required.
However both MI250x and MI100 systems have been testing locally using the updated tests here:
intel/llvm@sycl...JackAKirk:llvm:p2p-make-test-portable

Eventually I will open a PR to update these tests in intel/llvm. However the form of these tests depends on intel/llvm#10737 getting merged.

cc @jinz2014

@JackAKirk JackAKirk changed the title [HIP] USM-P2P extension implementation for hip. [HIP] USM-P2P extension implementation Jan 8, 2024
return UR_RESULT_ERROR_UNSUPPORTED_FEATURE;
UR_APIEXPORT ur_result_t UR_APICALL urUsmP2PDisablePeerAccessExp(
ur_device_handle_t commandDevice, ur_device_handle_t peerDevice) {
ur_result_t result = UR_RESULT_SUCCESS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use the pattern where instead of using this result var you just return err in the catch and otherwise return UR_RESULT_SUCCESS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

return ReturnValue(uint32_t{0});

int value;
hipDeviceP2PAttr hip_attr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually in the hip adapter we use UpperCamelCase for vars. You have a mix of lowerCamelCase and snake_case in this PR. Could you make consistent please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

@hdelan
Copy link
Contributor

hdelan commented Jan 9, 2024

Should we enable some tests in DPC++ to check this behaviour? Can we test for this in UR?

@JackAKirk
Copy link
Contributor Author

Should we enable some tests in DPC++ to check this behaviour? Can we test for this in UR?

See #1234 (comment)

@hdelan
Copy link
Contributor

hdelan commented Jan 9, 2024

See #1234 (comment)

Aha sorry missed this

@JackAKirk
Copy link
Contributor Author

JackAKirk commented Jan 9, 2024

Should we enable some tests in DPC++ to check this behaviour? Can we test for this in UR?

See #1234 (comment)

For UR testing it is a good question. I remember some discussion around having (or moving) tests like this in UR:
https://github.com/JackAKirk/llvm/blob/4ab6215b48d80376670200cdd98216964609d530/sycl/unittests/Extensions/USMP2P.cpp

There could also in principle be e2e integration testing in UR, that for example could test integration of features, such as enabling p2p, and then performing usm copies via p2p. I do not know if tests like this exist or are desired in the future.

@JackAKirk JackAKirk requested a review from a team as a code owner January 9, 2024 15:37
Copy link
Contributor

@hdelan hdelan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@JackAKirk
Copy link
Contributor Author

JackAKirk commented Jan 9, 2024

Should we enable some tests in DPC++ to check this behaviour? Can we test for this in UR?

See #1234 (comment)

For UR testing it is a good question. I remember some discussion around having (or moving) tests like this in UR: https://github.com/JackAKirk/llvm/blob/4ab6215b48d80376670200cdd98216964609d530/sycl/unittests/Extensions/USMP2P.cpp

There could also in principle be e2e integration testing in UR, that for example could test integration of features, such as enabling p2p, and then performing usm copies via p2p. I do not know if tests like this exist or are desired in the future.

I thought about it a bit more. I could add a test at this point and happy to do so if people want it. I think that it would require a system with two or more gpus in each platform for it to be executed, essentially the same as p2p tests in intel/llvm but limited to ur scope. I think such a test would be most sensible at the "conformance" level (although my reading is atm such tests aren't written for extensions?), such that it only uses UR apis portably and tests correct e2e behavior. Doing any kind of testing with just a single gpu would have to be platform specific I think and would only be able to test that the correct native errors are returned when you try to do any p2p with only one gpu. I don't think that would really be worthwhile since you are not really testing much without at least two gpus.

Signed-off-by: JackAKirk <[email protected]>
@hdelan
Copy link
Contributor

hdelan commented Jan 9, 2024

There are similar tests in DPC++ to test for buffer migration https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/Basic/buffer/buffer_migrate.cpp#L25 that only test the desired behaviour if using a multi gpu system. Something similar could be done here. Therefore the tests would always pass on a single gpu system but would test the desired behaviour by default on a multi gpu system

@JackAKirk
Copy link
Contributor Author

JackAKirk commented Jan 9, 2024

There are similar tests in DPC++ to test for buffer migration https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/Basic/buffer/buffer_migrate.cpp#L25 that only test the desired behaviour if using a multi gpu system. Something similar could be done here. Therefore the tests would always pass on a single gpu system but would test the desired behaviour by default on a multi gpu system

Yeah it would be identical to intel/llvm@sycl...JackAKirk:llvm:p2p-make-test-portable
but with UR apis instead of DPC++ ones.

@codecov-commenter
Copy link

codecov-commenter commented Jan 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (8c604b0) 15.39% compared to head (e3a1b47) 15.39%.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1234   +/-   ##
=======================================
  Coverage   15.39%   15.39%           
=======================================
  Files         240      240           
  Lines       34099    34099           
  Branches     3775     3775           
=======================================
  Hits         5250     5250           
  Misses      28798    28798           
  Partials       51       51           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

}
return result;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes appear unrelated. Please submit them as a separate cleanup

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the overhead involved in getting UR PRs merged I would be inclined to make an exception and keep these changes in this PR. They will still appear as a separate commit since we don't squash in UR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not a good reason. The hardest bug I ever had to solve was because of "just an extra bit of code" unrelated to the commit message.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right. If we don't squash, then it's slightly better, but it still feels unhygienic to me

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I defer to your judgement

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New PR for those format changes is here: #1311

@JackAKirk JackAKirk marked this pull request as draft February 2, 2024 12:33
@JackAKirk
Copy link
Contributor Author

JackAKirk commented Feb 2, 2024

There are similar tests in DPC++ to test for buffer migration https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/Basic/buffer/buffer_migrate.cpp#L25 that only test the desired behaviour if using a multi gpu system. Something similar could be done here. Therefore the tests would always pass on a single gpu system but would test the desired behaviour by default on a multi gpu system

Yeah it would be identical to intel/[email protected]:llvm:p2p-make-test-portable but with UR apis instead of DPC++ ones.

Also I think (but not certain) in UR it should be easier to make a portable test for hip/cuda/l0, even when cuda doesn't have the multi-device context patch. I will have a go, and plan to update this PR with the test.
It may also have value as a demonstration to openmp devs on how to use the APIs.

@JackAKirk
Copy link
Contributor Author

@kbenzie

Is there a plan to have tests for unified-runtime extensions conformance? atm I don't see such tests in the UR repo. I've been staring at the test folder and I think the only testing of value for this P2P extension would be a backend agnostic conformance test, but atm there isn't an existing example I can see for extensions, only core spec. If you think such a thing would be valuable I am happy to introduce such testing. However I do not want to do something new like this without consulting you first.

These changes will be made in separate PRs.

Signed-off-by: JackAKirk <[email protected]>
@kbenzie
Copy link
Contributor

kbenzie commented Feb 12, 2024

@kbenzie

Is there a plan to have tests for unified-runtime extensions conformance? atm I don't see such tests in the UR repo. I've been staring at the test folder and I think the only testing of value for this P2P extension would be a backend agnostic conformance test, but atm there isn't an existing example I can see for extensions, only core spec. If you think such a thing would be valuable I am happy to introduce such testing. However I do not want to do something new like this without consulting you first.

Experimental features are not required to have CTS tests but adding tests for them is very welcome. One of the criteria for making an experimental feature core is to have CTS tests. Please go ahead with adding P2P experimental feature tests.

@JackAKirk
Copy link
Contributor Author

I've added a test here: #1369
This PR is now blocked by that test PR being merged. Once this happens I will sync this PR with main and add the device extension string to the hip backend, thus enabling the test for the hip backend also.

@JackAKirk JackAKirk added the v0.9.x Include in the v0.9.x release label Mar 12, 2024
@JackAKirk JackAKirk closed this Mar 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
v0.9.x Include in the v0.9.x release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants