Describe the issue
I’ve noticed mismatches between the outputs of a PyTorch model and the corresponding ONNX model when running inference with ONNX Runtime. Specifically, I’m working with float16 precision, and the results differ between the two frameworks. I’m aware that such mismatches can occur with float32, but should I also expect similar discrepancies with float16 (perhaps because intermediate ops are computed in float32)? If so, what are the potential causes, and how can I resolve or minimize these differences?
Any insights or guidance on this matter would be greatly appreciated!
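For concreteness, here is a minimal sketch of the kind of comparison I mean. The toy model, file name, and tolerances below are placeholders, not my actual setup; the point is that I compare PyTorch and ONNX Runtime float16 outputs with relaxed tolerances rather than expecting bit-exact equality, and still see larger differences than expected.

```python
import numpy as np
import torch
import onnxruntime as ort

# Hypothetical toy model in float16 (placeholder for the real model).
model = torch.nn.Sequential(
    torch.nn.Linear(16, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 8),
).half().eval()

x = torch.randn(4, 16, dtype=torch.float16)

# Export the float16 model to ONNX.
torch.onnx.export(model, (x,), "model_fp16.onnx",
                  input_names=["x"], output_names=["y"])

# PyTorch reference output (requires a PyTorch build with float16 support
# for these ops on CPU; otherwise run the reference on GPU).
with torch.no_grad():
    ref = model(x).float().numpy()

# ONNX Runtime output on the default CPU execution provider.
sess = ort.InferenceSession("model_fp16.onnx",
                            providers=["CPUExecutionProvider"])
out = sess.run(None, {"x": x.numpy()})[0].astype(np.float32)

# float16 carries only ~3 decimal digits of precision, so the comparison
# uses loose tolerances instead of exact equality.
np.testing.assert_allclose(ref, out, rtol=1e-2, atol=1e-3)
```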
To reproduce
Urgency
No
Platform
Linux
OS Version
Ubuntu 22.04.3 LTS
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.21.0
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response