Hi, I would like to understand why the comparison with the golden logits uses high absolute tolerance values, such as:

- `--atol=3` for Mistral-7b
- `--atol=1.0` for Gemma2-9b

while some models achieve more similar results with smaller tolerances:

- `--atol=0.2` for Llama-2-70b

Why is this high tolerance needed? I understand we might expect numerical differences, but I was hoping for more similar results when the implementations are equivalent.
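For reference, here is a minimal sketch of what I understand an absolute-tolerance logit comparison to mean (the function name and the NumPy-based check are my own illustration, not the repo's actual test harness): the check passes only if every candidate logit lies within `atol` of the corresponding golden logit, so `--atol=3` already accepts per-logit deviations up to 3.0.

```python
import numpy as np

def check_golden_logits(golden: np.ndarray, candidate: np.ndarray, atol: float) -> bool:
    """Pass only if every logit is within `atol` of its golden counterpart."""
    max_abs_diff = float(np.max(np.abs(candidate - golden)))
    print(f"max |candidate - golden| = {max_abs_diff:.4f} (atol = {atol})")
    # rtol=0 so that only the absolute tolerance discussed above applies
    return bool(np.allclose(candidate, golden, rtol=0.0, atol=atol))

# Example: atol=3 accepts per-logit deviations as large as 3.0
golden = np.array([10.0, -2.5, 0.1])
candidate = golden + np.array([2.9, -0.4, 0.05])  # passes atol=3, fails atol=0.2
assert check_golden_logits(golden, candidate, atol=3.0)
assert not check_golden_logits(golden, candidate, atol=0.2)
```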