forked from pytorch/ao
Commit
Update quantize.py to use AO's int4 quantizer (pytorch#919)
* Use ao's int4 quantizer
* Point AO to commit hash of Jerry's fix
* When device is cuda, only run for dtype==bfloat16
* Typo
* Use tensor subclass for int4 weight only quant
* Fix bug
* Fix
* Use both quantizer and subclass API
* Bug
* unwrap tensor subclass for aoti
* Add import
* Eval fix
* Evaluate AOTI

---------

Co-authored-by: Mengwei Liu <[email protected]>
1 parent 53344db, commit 87798fd
Showing 2 changed files with 65 additions and 146 deletions.