Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gemlite fixes #1432

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Gemlite fixes #1432

wants to merge 1 commit into from

Conversation

HDCharles
Copy link
Contributor

Summary:

shapes need to be divisible by 128 or they will not work with gemlite need fp32 accumulation for groupsize None on int4

Test Plan:

python test_integration.py -k "test_gemlite" (new test for non divisible shape)a

python generate.py --checkpoint_path $CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --precision float16 --quantization gemlite-8-4-None --write_result benchmark_results.txt python generate.py --checkpoint_path
$CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --precision float16 --quantization gemlite-32-4-None --write_result benchmark_results.txta

(previously these gave nonsense responses)

Reviewers:

Subscribers:

Tasks:

Tags:

Summary:

shapes need to be divisible by 128 or they will not work with gemlite
need fp32 accumulation for groupsize None on int4

Test Plan:

python test_integration.py -k "test_gemlite" (new test for non divisible
shape)a

python generate.py --checkpoint_path $CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --precision float16 --quantization gemlite-8-4-None  --write_result benchmark_results.txt
python generate.py --checkpoint_path
$CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --precision float16
--quantization gemlite-32-4-None  --write_result benchmark_results.txta

(previously these gave nonsense responses)

Reviewers:

Subscribers:

Tasks:

Tags:
Copy link

pytorch-bot bot commented Dec 18, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1432

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 1797c75 with merge base 33d57af (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@HDCharles HDCharles requested a review from jerryzh168 December 18, 2024 14:39
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Dec 18, 2024
@HDCharles HDCharles requested a review from mobicham December 18, 2024 14:39
@@ -41,6 +37,14 @@ def elapsed_time(self, other_event):
return abs(other_event.event_time - self.event_time) * 1000


def get_arch_name() -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why these changes? is this some rebase issue

jerryzh168 added a commit to jerryzh168/ao that referenced this pull request Dec 18, 2024
Summary:
Resubmitting fixes from @HDCharles in pytorch#1432
since that seems to have issues with rebase

Test Plan:
see pytorch#1432

Reviewers:

Subscribers:

Tasks:

Tags:
if _layout.group_size == None and _layout.bit_width == 4:
from gemlite.core import GEMLITE_ACC_DTYPE
from gemlite.dtypes import DType
GEMLITE_ACC_DTYPE[DType.FP16] = DType.FP32
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only work when all the layers use the same group_size, which is ok for now.
The other option will be using this https://github.com/mobiusml/gemlite/blob/master/gemlite/core.py#L87 but for now let's keep it like this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this manually, it works in all cases even when there are different group sizes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean when different layers use different settings within the same model, but let's not worry about that !

jerryzh168 added a commit to jerryzh168/ao that referenced this pull request Dec 18, 2024
Summary:
Resubmitting pytorch#1432 since it has some rebase issues and
we want to merge the fix asap

Test Plan:
see pytorch#1432

Reviewers:

Subscribers:

Tasks:

Tags:
jerryzh168 added a commit that referenced this pull request Dec 18, 2024
* [resubmit] Gemlite fix

Summary:
Resubmitting #1432 since it has some rebase issues and
we want to merge the fix asap

Test Plan:
see #1432

Reviewers:

Subscribers:

Tasks:

Tags:

* ruff
@jerryzh168
Copy link
Contributor

landed in #1435, please feel free to submit any follow up fixes

amdfaa pushed a commit that referenced this pull request Jan 10, 2025
* [resubmit] Gemlite fix

Summary:
Resubmitting #1432 since it has some rebase issues and
we want to merge the fix asap

Test Plan:
see #1432

Reviewers:

Subscribers:

Tasks:

Tags:

* ruff
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants