
Issue with QM Calculations Terminating Prematurely When Assigned to Individual GPUs #154

ORCAaAaA-ui opened this issue May 15, 2024 · 3 comments · Fixed by #158

@ORCAaAaA-ui

I am experiencing an issue with running QM calculations on individual GPUs. When I assign QM calculations to separate GPUs, the calculations terminate prematurely. This does not happen when running the calculations on a single GPU, and I have ensured that each GPU has sufficient memory.

Any recommended steps to further diagnose and resolve this issue?

I appreciate any assistance or guidance on this issue. Thank you.

@wxj6000 (Collaborator) commented May 22, 2024

@ORCAaAaA-ui How do you assign QM calculations to separate GPUs? Can you share your script here so we can diagnose the problem?

wxj6000 linked a pull request on May 26, 2024 that will close this issue
@wxj6000 (Collaborator) commented May 26, 2024

@ORCAaAaA-ui There are at least two ways to select an individual GPU. 1) Use Docker: you can specify which GPUs are visible when you run docker run. 2) Use CuPy: https://docs.cupy.dev/en/stable/reference/generated/cupy.cuda.Device.html. Note that, for now, you can only import gpu4pyscf modules after the device has been selected.

Once the above PR is merged, one will be able to import GPU4PySCF before the device is selected.
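
For concreteness, here is a minimal sketch of option 2 under the current (pre-PR) import-order requirement; the molecule, basis, and functional are illustrative, not from this thread:

```python
import cupy

# Select the target GPU first. Until the PR above is merged,
# gpu4pyscf modules may only be imported after this call.
cupy.cuda.Device(1).use()

import pyscf
from gpu4pyscf.dft import rks  # imported after the device is selected

# Illustrative water / RKS input; replace with your own system.
mol = pyscf.M(
    atom="O 0 0 0; H 0 0.757 0.587; H 0 -0.757 0.587",
    basis="def2-tzvpp",
)
mf = rks.RKS(mol, xc="b3lyp").density_fit()
e_tot = mf.kernel()
```

For option 1, restricting GPU visibility at container start (e.g. with docker run's --gpus flag, assuming the NVIDIA Container Toolkit) achieves the same isolation without touching the script.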

wxj6000 reopened this on May 28, 2024
@ORCAaAaA-ui (Author)

@wxj6000 I simply executed export CUDA_VISIBLE_DEVICES=0 (or 1) before each job to assign the jobs to separate GPUs.
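
One caveat with that approach: CUDA reads CUDA_VISIBLE_DEVICES when it initializes, so the variable must be set before any CUDA library loads. If the jobs are started from Python rather than the shell, a sketch of the safe ordering (the device id "1" is illustrative):

```python
import os

# Must be set before anything initializes CUDA, i.e. before
# importing cupy or gpu4pyscf in this process.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

import cupy
from gpu4pyscf.dft import rks  # only the chosen GPU is visible now

print(cupy.cuda.runtime.getDeviceCount())  # expect: 1
```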
