Commit

Update the code base with AdamCPR and for PyTorch v2.3.1+. Add figures and more detailed descriptions in README. Tests are not completed.
JoergFranke committed Nov 9, 2024
1 parent ceb676d commit 82843ea
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -4,7 +4,7 @@
This repository contains the PyTorch implementation of [**Constrained Parameter Regularization**](https://arxiv.org/abs/2311.09058)(CPR) with the Adam optimizer.
CPR is an alternative to traditional weight decay. Unlike the uniform application of a single penalty, CPR enforces an upper bound on a statistical measure, such as the L2-norm, of individual parameter matrices. CPR introduces only a minor runtime overhead and only requires setting an upper bound (or does it automatically with an inflection point detection).

- AdamCPR outperforms AdamW on various tasks, such as imagenet (CIFAR100 and ImageNet) or language modeling (GPT2/OpenWebText) as in the figure below.
+ AdamCPR outperforms AdamW on various tasks, such as image classification (CIFAR100 and ImageNet) or language modeling finetuning or pretraining (GPT2/OpenWebText) as in the figure below.

<img src="figures/gpt2s_adamw200_300_cprIP.jpg" width="390" height="240">
