Add new quantization scheme #1100
Comments
We already support Po2 quantization in Brevitas, sometimes referred to internally as FixedPoint. We have some presets here: |
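For readers unfamiliar with the term, here is a minimal sketch of what Po2 (fixed-point) quantization usually means, assuming it refers to integer quantization whose scale factor is restricted to a power of two. The function names `po2_scale`, `quantize_po2`, and `dequantize` are hypothetical and are not Brevitas APIs.

```python
import math


def po2_scale(weights, bits=8):
    """Smallest power-of-two scale that covers max|w| with a signed `bits`-bit integer."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return 1.0  # arbitrary scale for an all-zero tensor
    # Rounding the exponent up keeps the largest weight inside the integer range.
    return 2.0 ** math.ceil(math.log2(max_abs / qmax))


def quantize_po2(weights, bits=8):
    """Return (integer codes, scale); dequantizing is a pure bit shift in hardware."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    scale = po2_scale(weights, bits)
    codes = [max(qmin, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    return [c * scale for c in codes]


codes, scale = quantize_po2([0.5, -0.25, 0.1])
print(codes, scale, dequantize(codes, scale))
```

Note that the quantization levels are still uniformly spaced; only the scale is a power of two.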
I see, let me explain my situation. I have a weight distribution where most of the values are close to zero. I tried INT quantization, but the result was really poor because most of the values are quantized to zero, so I tried the [...] Currently, I have two ideas:
What do you think? Have you ever experienced something like that? NOTE: I am using channel-wise quantization |
I am assuming that you're using PTQ. You can try to use some of our PTQ techniques like weight equalization, which should help remove outliers. The idea is the following:
This might take a bit of time to set up and to make sure that everything works as intended, because weight equalization is not compatible with all network topologies. There are also other methods, but honestly this is very application-specific, and it could be worth reaching out offline to discuss those if you're still facing issues. With respect to non-uniform quantization, Brevitas does not currently support it. Moreover, non-uniform quantization is generally not convenient for the use case you mentioned in another PR, since it might require dequantization to perform operations (that is not always true; it depends on the algorithm). |
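As a rough illustration of the weight equalization idea mentioned above, here is a minimal sketch of cross-layer equalization for two consecutive fully connected layers with a ReLU in between. It is a generic version of the technique, not the Brevitas implementation, and `equalize_pair` and `forward` are hypothetical helper names. Each output channel of the first layer is divided by a factor `s`, and the matching input channel of the second layer is multiplied by `s`, so the network output is unchanged while the per-channel ranges are balanced.

```python
import math


def equalize_pair(w1, b1, w2):
    """w1: rows are output channels of layer 1; w2: columns are input channels of layer 2."""
    w1_eq, b1_eq = [], []
    w2_eq = [row[:] for row in w2]
    for i in range(len(w1)):
        r1 = max(abs(v) for v in w1[i])      # range of channel i in layer 1
        r2 = max(abs(row[i]) for row in w2)  # range of channel i in layer 2
        # s = sqrt(r1 / r2) makes both ranges equal to sqrt(r1 * r2).
        s = math.sqrt(r1 / r2) if r1 > 0 and r2 > 0 else 1.0
        w1_eq.append([v / s for v in w1[i]])
        b1_eq.append(b1[i] / s)
        for row in w2_eq:
            row[i] *= s
    return w1_eq, b1_eq, w2_eq


def forward(w1, b1, w2, x):
    """Linear -> ReLU -> Linear, used only to check that the output is preserved."""
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    return [sum(w * hi for w, hi in zip(row, h)) for row in w2]


w1 = [[4.0, -8.0], [0.01, 0.02]]  # channel 1 is tiny compared to channel 0
b1 = [0.5, -0.01]
w2 = [[0.1, 2.0], [-0.2, 1.0]]
w1_eq, b1_eq, w2_eq = equalize_pair(w1, b1, w2)
print(forward(w1, b1, w2, [1.0, 0.25]), forward(w1_eq, b1_eq, w2_eq, [1.0, 0.25]))
```

The rescaling relies on ReLU being positively homogeneous, which is one reason the technique does not apply to every topology.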
Apologies for my misunderstanding about Power-of-Two quantization; for some reason I completely missed the link. In general, we're looking into some non-uniform quantization options. If there's a relatively easy way to get that implemented and you're willing to contribute, I am happy to help in the process :) |
Amazing, I think it could be a nice new feature for Brevitas due to its flexibility and easy HW implementation. Let me know what you think ;) |
Do you think it's feasible to add Additive Power-of-Two quantization to Brevitas?
Even though it is known as a non-uniform quantization technique, it is very HW-friendly, and it can help when we need a more flexible representation.
I can try to do it, I would just like to know what you think!
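To make the proposal concrete, here is a minimal sketch of how Additive Power-of-Two levels can be built, assuming the usual formulation in which each level is a sum of `n = bits / k` terms and every term is either zero or a power of two. The names `apot_levels` and `apot_quantize` are hypothetical, and the sign handling and clipping value `alpha` are simplifying assumptions.

```python
from itertools import product


def apot_levels(bits=4, k=2):
    """Normalized, unsigned APoT levels in [0, 1] for `bits` bits and base width `k`."""
    if bits % k != 0:
        raise ValueError("bits must be a multiple of k")
    n = bits // k  # number of additive power-of-two terms
    # Term i is chosen from {0, 2^-i, 2^-(i+n), ..., 2^-(i+(2^k-2)n)}.
    term_sets = [
        [0.0] + [2.0 ** -(i + j * n) for j in range(2 ** k - 1)]
        for i in range(n)
    ]
    sums = sorted({sum(choice) for choice in product(*term_sets)})
    top = sums[-1]
    return [s / top for s in sums]  # rescale so the largest level is 1


def apot_quantize(w, levels, alpha):
    """Project one weight onto the nearest level, keeping its sign."""
    mag = min(abs(w) / alpha, 1.0)  # clip to [0, alpha], then normalize
    nearest = min(levels, key=lambda q: abs(q - mag))
    sign = -1.0 if w < 0 else 1.0
    return sign * nearest * alpha


levels = apot_levels(bits=4, k=2)
print(levels)
print([apot_quantize(w, levels, alpha=1.0) for w in (0.03, -0.4, 2.0)])
```

The levels are denser near zero than a uniform grid with the same number of codes, which is the property that matters for weight distributions concentrated around zero.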