Hi, I would like to understand why the comparison with the golden logits uses high absolute tolerance values, such as:

- `--atol=3` for Mistral-7b
- `--atol=1.0` for Gemma2-9b

while some models achieve more similar results with smaller tolerances:

- `--atol=0.2` for Llama-2-70b

Why is this high tolerance needed? I understand we might expect numerical differences, but I was hoping for more similar results when the implementations are equivalent.
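For reference, here is a minimal sketch of what I understand an absolute-tolerance logit comparison to mean (the function name and the NumPy-based check are my own illustration, not the repo's actual test harness): the check passes only if every candidate logit lies within `atol` of the corresponding golden logit, so `--atol=3` already accepts per-logit deviations up to 3.0.

```python
import numpy as np

def check_golden_logits(golden: np.ndarray, candidate: np.ndarray, atol: float) -> bool:
    """Pass only if every logit is within `atol` of its golden counterpart."""
    max_abs_diff = float(np.max(np.abs(candidate - golden)))
    print(f"max |candidate - golden| = {max_abs_diff:.4f} (atol = {atol})")
    # rtol=0 so that only the absolute tolerance discussed above applies
    return bool(np.allclose(candidate, golden, rtol=0.0, atol=atol))

# Example: atol=3 accepts per-logit deviations as large as 3.0
golden = np.array([10.0, -2.5, 0.1])
candidate = golden + np.array([2.9, -0.4, 0.05])  # passes atol=3, fails atol=0.2
assert check_golden_logits(golden, candidate, atol=3.0)
assert not check_golden_logits(golden, candidate, atol=0.2)
```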