Mixed Precision Training #113

Merged
merged 2 commits into main from fix/lower-vram-usage on Oct 31, 2024

Conversation

JSabadin
Contributor

Mixed Precision Training Implementation

This pull request introduces mixed precision training to improve the performance and efficiency of our model training.

Key Changes

  • Mixed precision training has been implemented, leveraging the automatic mixed precision (AMP) feature available in PyTorch (a minimal sketch of the pattern follows this list).
  • Testing results:
    • Before this change, the detection model took:
      • 10.5 minutes per epoch at batch size 16 (bs16)
      • 10.5 minutes per epoch at batch size 32 (bs32); no speedup, since GPU utilization was already high and VRAM was full.
    • With mixed precision training, the detection epoch time dropped to:
      • 8.5 minutes per epoch at batch size 32 (bs32), a significant performance boost.
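
A minimal sketch of the AMP pattern described above, assuming a CUDA device. The model, optimizer, loss, and data are toy stand-ins for illustration, not identifiers from this repository:

```python
import torch
import torch.nn as nn

# Toy stand-ins so the sketch runs on its own; the real model, data,
# and losses in this repo differ.
model = nn.Linear(32, 4).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
loader = [(torch.randn(16, 32), torch.randint(0, 4, (16,))) for _ in range(4)]

# Gradient scaler keeps fp16 gradients from underflowing.
scaler = torch.amp.GradScaler("cuda")

for images, targets in loader:
    images, targets = images.cuda(), targets.cuda()
    optimizer.zero_grad()
    # Forward pass under autocast: eligible ops run in fp16, the rest in fp32.
    with torch.amp.autocast(device_type="cuda"):
        predictions = model(images)
        loss = loss_fn(predictions, targets)
    # Scale the loss before backward; step() unscales before the update.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```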

Limitations

  • This implementation has been tested only on detection models; further testing on other model types is recommended.
  • Some loss functions do not support mixed precision training. In these cases, the following should be applied:
    • Wrap the loss computation in torch.amp.autocast(device_type=..., enabled=False) to disable AMP for specific loss functions (see the sketch after this list).
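
A sketch of that opt-out pattern, continuing the toy setup from the sketch above; sensitive_loss is a hypothetical placeholder for a loss that is unstable in fp16:

```python
# Hypothetical loss that is numerically unstable in half precision.
def sensitive_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    return (pred - target).pow(2).mean()

with torch.amp.autocast(device_type="cuda"):
    predictions = model(images)  # forward pass still runs in mixed precision

    # Opt only the loss out of AMP. Tensors produced under autocast may be
    # fp16, so upcast them explicitly before the full-precision computation.
    with torch.amp.autocast(device_type="cuda", enabled=False):
        loss = sensitive_loss(predictions.float(), targets.float())
```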

Benefits of Mixed Precision Training

  • Reduced training time: By utilizing half-precision for certain operations, training time has been significantly reduced while maintaining model accuracy.
  • Better hardware utilization: Improved GPU memory usage, allowing for larger batch sizes without running into VRAM issues.

@JSabadin JSabadin requested a review from a team as a code owner October 12, 2024 09:47
@JSabadin JSabadin requested review from kozlov721, klemen1999, tersekmatija and conorsim and removed request for a team October 12, 2024 09:47
@github-actions github-actions bot added the fix (Fixing a bug) and release (New version release) labels Oct 12, 2024
@kozlov721 kozlov721 changed the title Fix/lower vram usage Mixed Precision Training Oct 14, 2024
Collaborator

@kozlov721 kozlov721 left a comment

LGTM

@klemen1999 klemen1999 merged commit 1d83a2c into main Oct 31, 2024
5 of 6 checks passed
@klemen1999 klemen1999 deleted the fix/lower-vram-usage branch October 31, 2024 09:24