Broadcast zero point vector when converting batched matmul to non-batched #572

Merged 1 commit on Feb 3, 2025

robertknight
Owner

When a batched matmul `[A, M, K] x [K, N]` is converted to a non-batched matmul with LHS shape `[A * M, K]`, the zero point vector needs to be broadcast to match the new row count.

This fixes an error when running the Segment Anything demo with a quantized image encoder.
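
To illustrate the idea, here is a minimal pure-Python sketch, not the code from this PR. It assumes the LHS zero point is a vector with one entry per row of a single batch item (length `M`), which seems to be what "match the new row count" implies. The function names `matmul_int` and `batched_matmul_int` are hypothetical.

```python
def matmul_int(lhs, rhs, lhs_zero_point):
    """Multiply lhs [rows, K] by rhs [K, N], subtracting a per-row LHS zero point."""
    # A non-batched kernel expects exactly one zero point per LHS row.
    assert len(lhs_zero_point) == len(lhs), "need one zero point per LHS row"
    n = len(rhs[0])
    return [
        [sum((x - zp) * rhs[k][j] for k, x in enumerate(row)) for j in range(n)]
        for row, zp in zip(lhs, lhs_zero_point)
    ]


def batched_matmul_int(lhs, rhs, lhs_zero_point):
    """Compute [A, M, K] x [K, N] by flattening the LHS to [A * M, K]."""
    a, m = len(lhs), len(lhs[0])
    # Flatten the batch and row dimensions into a single row dimension.
    flat_lhs = [row for batch in lhs for row in batch]
    # Assumption: the zero point has M entries. Flattened row a * M + m uses
    # entry m, so the vector is repeated A times to cover all A * M rows.
    flat_zero_point = list(lhs_zero_point) * a
    flat_out = matmul_int(flat_lhs, rhs, flat_zero_point)
    # Restore the [A, M, N] output shape.
    return [flat_out[i * m:(i + 1) * m] for i in range(a)]


lhs = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]  # shape [A=2, M=2, K=2]
rhs = [[1, 0], [0, 1]]                      # shape [K=2, N=2]
zero_point = [1, 2]                         # one entry per row, length M
result = batched_matmul_int(lhs, rhs, zero_point)
```

In this sketch, passing the length-`M` zero point straight to `matmul_int` with the flattened LHS would trip the length assertion, since the flattened LHS has `A * M` rows.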

robertknight force-pushed the broadcast-matmul-zero-point branch 2 times, most recently from f90c865 to ad04fbd, on February 3, 2025 06:41
robertknight force-pushed the broadcast-matmul-zero-point branch from ad04fbd to 81e2328 on February 3, 2025 06:46
robertknight changed the title from "Adjust zero point when converting batched matmul to non-batched" to "Broadcast zero point vector when converting batched matmul to non-batched" on Feb 3, 2025
robertknight merged commit b5a6d1b into main on Feb 3, 2025
2 checks passed
robertknight deleted the broadcast-matmul-zero-point branch on February 3, 2025 06:52