
Reducing CUDA Memory #20

Open

Ryandonofrio3 opened this issue Aug 29, 2023 · 4 comments

Comments

@Ryandonofrio3
I am trying to train on some videos of Mosquitoes and am doing some preprocessing. I am running into

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 12.00 GiB total capacity; 9.91 GiB already allocated; 0 bytes free; 11.27 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I am on a 3080 Ti. Inside config.py I reduced the default number of points to 20 and the chunk size to a measly 100, yet I still get memory errors. Any suggestions? I ran nvidia-smi and nothing else is hogging the GPU. I'm trying to squeeze this down; I can't afford an A100!
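As the traceback itself suggests, one low-effort thing to try first is the allocator's max_split_size_mb setting, which can reduce fragmentation when reserved memory far exceeds allocated memory. A minimal sketch (128 MiB is an arbitrary starting value, not a recommendation from this repo; whether it helps depends on the fragmentation pattern):

```python
import os

# Must be set before torch is first imported: the CUDA caching
# allocator reads PYTORCH_CUDA_ALLOC_CONF once, at initialization.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# import torch  # import torch only after setting the variable
```

Equivalently, you can export the variable in the shell before launching the script.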

@ShenEffort

I have the same problem; mine is a 1080 Ti. I changed num_pts and chunk_size to 1, but it still does not work. Could the author publish a small dataset that fits in 12 GB of video memory?

@qianqianwang68
Owner

Sorry for the confusion: num of points and chunk size only control memory usage during training, not during preprocessing. Could you provide more details about which part of the preprocessing code leads to the OOM error? One likely reason is that lines like this consume too much memory.
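For readers unfamiliar with what chunk_size buys you at training time, chunked evaluation generically looks like the sketch below (a generic pattern, not this repository's actual code; process_chunk is a placeholder for the per-point work):

```python
def chunked_map(items, chunk_size, process_chunk):
    """Apply process_chunk to items in fixed-size chunks, so only
    chunk_size items' worth of intermediates are alive at once."""
    results = []
    for start in range(0, len(items), chunk_size):
        results.extend(process_chunk(items[start:start + chunk_size]))
    return results

# Example: square 10 "points" in chunks of 4 (peak work is 4 items, not 10).
out = chunked_map(list(range(10)), 4, lambda chunk: [x * x for x in chunk])
```

The trade-off is throughput: smaller chunks mean lower peak memory but more kernel-launch overhead per item.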

@Ryandonofrio3
Author

Ryandonofrio3 commented Aug 30, 2023

@qianqianwang68

Ah, I see. Thank you!

So yes, I am able to compute the DINO features, but the moment it begins computing the pairwise optical flows it instantly crashes; that step needs a very large amount of memory. Can we scale this down or chunk it during preprocessing as well? Thank you for any advice you may be able to provide!
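One generic way to bound peak memory in a pairwise step like this is to enumerate the frame pairs up front and process them in small batches, keeping only one batch of flow fields on the GPU at a time. A torch-free sketch of just the pair scheduling (the flow computation itself is omitted, and batch_size is a made-up knob, not one from this repo):

```python
from itertools import combinations

def paired_batches(num_frames, batch_size):
    """Yield all (i, j) frame pairs in batches of at most batch_size,
    so a flow model only ever holds batch_size pairs of frames."""
    pairs = list(combinations(range(num_frames), 2))
    for start in range(0, len(pairs), batch_size):
        yield pairs[start:start + batch_size]

# 5 frames -> C(5, 2) = 10 pairs, split into batches of sizes 4, 4, 2.
batches = list(paired_batches(5, 4))
```

After each batch, the flow results can be moved to CPU (or written to disk) before the next batch is loaded, so GPU usage stays roughly constant regardless of video length.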

@StephenZhao1

> I am trying to train on some videos of Mosquitoes and am doing some preprocessing. I am running into torch.cuda.OutOfMemoryError ... Yet still memory errors. Any suggestions?

I changed num_pairs to 4, and it ran successfully on a 3060 with 12 GB of memory.
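If num_pairs caps how many other frames each frame is paired with for flow (an assumed interpretation of the parameter, not verified against the code), reducing it shrinks the pair count from quadratic to roughly linear in the number of frames:

```python
def limited_pairs(num_frames, num_pairs):
    """Pair each frame with at most num_pairs of the frames after it.
    This is an assumed interpretation of the num_pairs setting."""
    return [(i, j)
            for i in range(num_frames)
            for j in range(i + 1, min(i + 1 + num_pairs, num_frames))]

exhaustive = len(limited_pairs(100, 99))  # every pair of 100 frames: 4950
reduced = len(limited_pairs(100, 4))      # at most 4 partners per frame: 390
```

That order-of-magnitude drop in pairs would explain why num_pairs=4 fits in 12 GB when the defaults do not.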


4 participants