Fix LTC build error in CI #3910
Comments
This commit disables the LTC build in the Torch-MLIR CI because, since the recent GH runner version upgrade, the Torch-MLIR build in CI has been failing with an LTC-related error. The tracking issue can be found here: llvm#3910 Signed-off-by: Vivek Khandelwal <[email protected]>
@vivekkhandelwal1 could you share the range of commits between the passing and failing CI runs? Is it from the PyTorch version update? That would help us narrow down the root cause. The error appears to come from the link step, with an undefined symbol.
Hi @ke1337, the error is not caused by the PyTorch version update. Any of the latest PR CI runs will fail with this error, independent of the changes made in the PR. Also, the same PR's CI was passing before the GH runner version upgrade, and it started failing as soon as that upgrade was done. Before the upgrade the CI was not even running; all the jobs were queued indefinitely.
What was the purpose of the GH runner version upgrade? Could there be an issue with the GH runner itself? Can we roll back the GH runner version and upgrade to a version that doesn't have this issue?
@saienduri can tell you about this.
@saienduri a gentle reminder: in addition to the queries above, could you also please let us know whether any logs were generated during the GH runner upgrade?
Hello, we had to upgrade the GH runner version because the old one was deprecated. What logs are you looking for? The undefined symbol error linked above is the one we have to get past to enable LTC again.
Hi - the current CI bots are managed by my team, and we can only provide limited support for features we do not use (basically, we are happy to answer easy questions and, if it does not cause problems, to run some additional configs). But LTC has a complex relationship with PyTorch that is different from everything else in the repo. I would recommend bringing up your own runners if you need to support LTC. As Sai says, GH forces runner upgrades and they are not optional. Tracking down this kind of thing is very costly, and that cost needs to be borne by the folks who use LTC. LTC will need to be tolerant of breakages, because we have to keep both the PyTorch and runner versions up to date and this seems to be a fragile integration.
After the recent GH runner version upgrade, the Torch-MLIR build in CI is failing with an LTC-related error. The error can be found here: https://github.com/llvm/torch-mlir/actions/runs/12138622225/job/33891848916#step:6:1.
Since the error is not yet fixed and all PRs are blocked on it, I'm disabling the LTC build in the CI. This issue will track progress on the fix.
CC: @antoniojkim @ke1337