Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HIPCC_BIN_DIR auto detects local machine /opt/rocm #21

Open
stellaraccident opened this issue Jan 22, 2025 · 1 comment
Open

HIPCC_BIN_DIR auto detects local machine /opt/rocm #21

stellaraccident opened this issue Jan 22, 2025 · 1 comment

Comments

@stellaraccident
Copy link
Contributor

Within clr, some spurious CMake ends up setting what I suspect was meant to be a local variable of HIPCC_BIN_DIR in an attempt to auto-detect HIP_PLATFORM. This can cause several validation errors that end up being meaningless if the HIP_PLATFORM was set manually (as we do). The case gets worse because after this point hipamd gets included which actually defines the cache variable for HIPCC_BIN_DIR. But because CMake scoping is what it is, this can result in some very surprising behaviors cropping up.

I'm adding a patch locally which disables this faulty auto detection for HIP_PLATFORM if it is already defined. And I'm adding a post hook that ensures that noting ends up setting HIPCC_BIN_DIR. The latter is just insurance because this variable is broken in a number of ways that can cause containment leaks with respect to local system state.

@stellaraccident
Copy link
Contributor Author

See #20

stellaraccident added a commit to stellaraccident/clr that referenced this issue Jan 22, 2025
This code appears to be old copy-pasta and is effectively just a no-op if -DHIP_PLATFORM=amd is passed explicitly. It has a number of nit-picky checks which will fail in this case for no real reason.

This is actually worse than that because it leaks global variables that establish a different default value/scoping vs the definitions in the inner hipamd/CMakeLists.txt. I opted to rewrite this section to scope the detection logic and keep it from leaking with respect to the inner build file. The detection logic is still not great but at least will not actively cause damage or make control flow hard to reason about. It seems like this is for legacy/compatibility so I opted to simply sequester it vs applying more diligence to it.

Also fixes a bug in hip-config.cmake where it enforces that hipcc exists even if not taking the install branch in the code above.

See: nod-ai/TheRock#21
stellaraccident added a commit to stellaraccident/clr that referenced this issue Jan 22, 2025
This code appears to be old copy-pasta and is effectively just a no-op if -DHIP_PLATFORM=amd is passed explicitly. It has a number of nit-picky checks which will fail in this case for no real reason.

This is actually worse than that because it leaks global variables that establish a different default value/scoping vs the definitions in the inner hipamd/CMakeLists.txt. I opted to rewrite this section to scope the detection logic and keep it from leaking with respect to the inner build file. The detection logic is still not great but at least will not actively cause damage or make control flow hard to reason about. It seems like this is for legacy/compatibility so I opted to simply sequester it vs applying more diligence to it.

Also fixes a bug in hip-config.cmake where it enforces that hipcc exists even if not taking the install branch in the code above.

See: nod-ai/TheRock#21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant