Gemlite fixes #1432
Summary:
Shapes need to be divisible by 128 or they will not work with gemlite.
fp32 accumulation is needed for group size None on int4.

Test Plan:
python test_integration.py -k "test_gemlite" (new test for a non-divisible shape)
python generate.py --checkpoint_path $CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --precision float16 --quantization gemlite-8-4-None --write_result benchmark_results.txt
python generate.py --checkpoint_path $CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --precision float16 --quantization gemlite-32-4-None --write_result benchmark_results.txt
(previously these gave nonsense responses)
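The divisibility requirement in the summary can be sketched as a small shape check. This is only an illustration: the helper names `is_divisible_shape` and `round_up` are hypothetical, and the summary does not say whether the fix skips non-divisible layers or pads them, so both options are shown as assumptions.

```python
# Multiple taken from the PR summary ("shapes need to be divisible by 128").
SHAPE_MULTIPLE = 128


def is_divisible_shape(out_features: int, in_features: int,
                       multiple: int = SHAPE_MULTIPLE) -> bool:
    """Return True when both weight dimensions are multiples of `multiple`.

    Hypothetical helper: a caller could use it to decide whether a linear
    layer is safe to hand to the kernel or should be left unquantized.
    """
    return out_features % multiple == 0 and in_features % multiple == 0


def round_up(dim: int, multiple: int = SHAPE_MULTIPLE) -> int:
    """Smallest multiple of `multiple` that is >= dim.

    Hypothetical helper: useful if one chose to pad a weight instead of
    skipping it (padding is an assumption, not something the PR states).
    """
    return ((dim + multiple - 1) // multiple) * multiple


# A 4096 x 11008 weight passes the check; a 1000 x 4096 weight does not.
ok_shape = is_divisible_shape(4096, 11008)
bad_shape = is_divisible_shape(1000, 4096)
padded = round_up(1000)
```

Skipping is the simpler choice because it leaves the layer's numerics untouched; padding keeps every layer quantized at the cost of extra memory and a slice on the output.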
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1432
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures as of commit 1797c75 with merge base 33d57af.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@@ -41,6 +37,14 @@ def elapsed_time(self, other_event):
        return abs(other_event.event_time - self.event_time) * 1000


def get_arch_name() -> str:
Why these changes? Is this a rebase issue?
Summary: Resubmitting fixes from @HDCharles in pytorch#1432 since that seems to have issues with rebase
Test Plan: see pytorch#1432
if _layout.group_size == None and _layout.bit_width == 4:
    from gemlite.core import GEMLITE_ACC_DTYPE
    from gemlite.dtypes import DType
    GEMLITE_ACC_DTYPE[DType.FP16] = DType.FP32
This will only work when all the layers use the same group_size, which is OK for now.
The other option would be to use https://github.com/mobiusml/gemlite/blob/master/gemlite/core.py#L87, but for now let's keep it like this.
I tested this manually; it works in all cases, even when there are different group sizes.
I mean when different layers use different settings within the same model, but let's not worry about that!
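For readers wondering why the accumulator dtype matters here, the following is a minimal sketch of how a float16 accumulator can stall during a long reduction. It is not gemlite's kernel: it assumes that with group size None the reduction runs over the whole input dimension, and it emulates float16 rounding with the standard library's `struct` half-precision format. A Python float stands in for the wider fp32 accumulator.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def accumulate(values, fp16_accumulator: bool) -> float:
    """Sum `values` one by one, optionally rounding the running sum to fp16."""
    acc = 0.0
    for v in values:
        acc = acc + v
        if fp16_accumulator:
            # Emulate storing the running sum in a float16 register.
            acc = to_fp16(acc)
    return acc


# 4096 products of magnitude 1.0, e.g. one long dot product without groups.
values = [1.0] * 4096

fp16_sum = accumulate(values, fp16_accumulator=True)
wide_sum = accumulate(values, fp16_accumulator=False)

print(fp16_sum)  # 2048.0: float16 cannot represent 2049, so the sum stops growing
print(wide_sum)  # 4096.0
```

Above 2048, adjacent float16 values are 2 apart, so adding 1.0 rounds back to the same number. A wider accumulator avoids this, which is consistent with the summary's note that the int4, group size None configuration needs fp32 accumulation.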
Summary: Resubmitting pytorch#1432 since it has some rebase issues and we want to merge the fix asap
Test Plan: see pytorch#1432
Landed in #1435, please feel free to submit any follow-up fixes.