ggml : fix more imatrix nan cases #11773
base: master
@bartowski1182 I only tested the 7B model, not sure if the issue with the 13B model is the same.
@@ -384,7 +384,7 @@ static float make_qx_quants(int n, int nmax, const float * restrict x, int8_t *
         float ax = fabsf(x[i]);
         if (ax > amax) { amax = ax; max = x[i]; }
     }
-    if (amax < GROUP_MAX_EPS) { // all zero
+    if (fabsf(amax) < GROUP_MAX_EPS) { // all zero
Didn't we already use `fabsf` just above? How is this extra `fabsf` supposed to help?
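The point of the question can be checked with a small sketch. The helper names `block_abs_max` and `checks_agree` are hypothetical and exist only for this illustration; the loop mirrors the scan shown in the hunk above, and `eps` stands in for `GROUP_MAX_EPS`. Because `amax` starts at zero and is only ever assigned a value returned by `fabsf`, it is never negative, so wrapping it in another `fabsf` should not change the outcome of the comparison.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: largest absolute value in a block, computed the
 * same way as the scan in the hunk above (amax starts at 0 and is only
 * ever assigned the result of fabsf). */
static float block_abs_max(const float *x, int n) {
    assert(x != NULL && n > 0);
    float amax = 0.0f;
    for (int i = 0; i < n; ++i) {
        float ax = fabsf(x[i]);
        if (ax > amax) amax = ax;
    }
    return amax;
}

/* Returns 1 when the old check (amax < eps) and the new check
 * (fabsf(amax) < eps) give the same answer for this block. */
static int checks_agree(const float *x, int n, float eps) {
    float amax = block_abs_max(x, n);
    return (amax < eps) == (fabsf(amax) < eps);
}
```

For any finite input, `checks_agree` is expected to return 1, which is why the extra `fabsf` looks like a no-op in this hunk.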
@@ -3021,7 +3021,7 @@ static void quantize_row_iq2_xxs_impl(const float * restrict x, void * restrict
             }
             float max = xval[0];
             for (int i = 1; i < 32; ++i) max = MAX(max, xval[i]);
-            if (max < GROUP_MAX_EPS) {
+            if (fabsf(max) < GROUP_MAX_EPS) {
`xval` contains the absolute values of the model weights, so how is this extra `fabsf` supposed to help?
Perhaps the following will help you to actually fix [...]

Let's denote the model weights in a block with [...] where the sum is over a quantization block. It doesn't matter how many [...]

To know what to do when (1) is not satisfied, one should first check if [...]

If (2) is not satisfied, we know that all model weights in the block are zero, so we can simply set all quants to zero and proceed with the next block. If (2) is satisfied but (1) is not, it means that (a) all imatrix values in the block are zero, or, the more tricky one, (b) non-zero imatrix values happen to coincide with zero model weights. In that case, the responsible thing to do would be to abort the quantization and tell the user to go fix their imatrix. But if this is considered inadequate for the many non-/semi-technical users of [...]
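The formulas for conditions (1) and (2) did not survive in the comment as captured here. One reading consistent with the cases it describes, offered as an assumption and not as the reviewer's exact notation, is that (1) is "the imatrix-weighted sum of squared weights over the block is positive" and (2) is "the plain sum of squared weights over the block is positive". Under that assumption, and assuming imatrix values are non-negative, the decision procedure could be sketched as follows. `classify_block`, `block_status`, and the parameter names are hypothetical and are not part of ggml.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome labels for one quantization block. */
typedef enum {
    BLOCK_OK,          /* assumed condition (1) holds: quantize normally   */
    BLOCK_ALL_ZERO,    /* assumed condition (2) fails: set all quants to 0 */
    BLOCK_BAD_IMATRIX  /* (2) holds but (1) does not: imatrix is unusable  */
} block_status;

/* x: model weights in the block, w: imatrix values (assumed >= 0).
 * Assumed condition (1): sum_i w[i] * x[i]^2 > 0
 * Assumed condition (2): sum_i x[i]^2 > 0                               */
static block_status classify_block(const float *x, const float *w, int n) {
    assert(x != NULL && w != NULL && n > 0);
    double sum_wx2 = 0.0;
    double sum_x2  = 0.0;
    for (int i = 0; i < n; ++i) {
        double x2 = (double)x[i] * (double)x[i];
        sum_x2  += x2;
        sum_wx2 += (double)w[i] * x2;
    }
    if (sum_wx2 > 0.0) return BLOCK_OK;
    if (sum_x2 == 0.0) return BLOCK_ALL_ZERO;
    /* Either every imatrix value is zero, or the non-zero ones line up
     * with zero weights. */
    return BLOCK_BAD_IMATRIX;
}
```

A caller could zero the quants for `BLOCK_ALL_ZERO` and either abort or fall back to some other weighting for `BLOCK_BAD_IMATRIX`; which of the two is appropriate is the policy question the comment raises.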
Fixes #11764
The other eps comparisons are also changed to check the absolute max value. These changes are not directly related to this issue, but I believe the previous comparisons were also incorrect.