Where is AttCAT-aggregated-by-rollout implemented?
Looking through the code, I could only find the sum AttCAT method.
Mathematically speaking, AttCAT produces attributions of shape `(batch, from_token)` for each layer. So rollout cannot be applied to them at all, since rollout multiplies per-layer matrices and therefore needs attributions of shape `(batch, to_token, from_token)`.
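To make the shape argument concrete, here is a minimal NumPy sketch. It is not this repository's code: `attention_rollout` is a hypothetical helper that follows the usual rollout recipe (mix each layer's matrix with the identity, renormalize the rows, then chain the layers by matrix product), and the random arrays are stand-ins for real attentions and AttCAT scores.

```python
import numpy as np

def attention_rollout(per_layer):
    """Chain per-layer maps of shape (batch, to_token, from_token).

    Hypothetical helper for illustration; heads are assumed to be
    averaged already.
    """
    rollout = None
    for attn in per_layer:
        # Rollout is a matrix product, so each layer must be a square matrix.
        if attn.ndim != 3 or attn.shape[1] != attn.shape[2]:
            raise ValueError(
                f"rollout needs (batch, to_token, from_token), got {attn.shape}"
            )
        eye = np.eye(attn.shape[-1])
        # Account for the residual connection, then renormalize each row.
        mixed = 0.5 * attn + 0.5 * eye
        mixed = mixed / mixed.sum(axis=-1, keepdims=True)
        rollout = mixed if rollout is None else mixed @ rollout
    return rollout

rng = np.random.default_rng(0)
batch, n_tokens, n_layers = 2, 4, 3

# Stand-in attention maps: rows normalized to sum to 1.
attentions = []
for _ in range(n_layers):
    raw = rng.random((batch, n_tokens, n_tokens))
    attentions.append(raw / raw.sum(axis=-1, keepdims=True))

rolled = attention_rollout(attentions)   # shape (batch, to_token, from_token)

# Stand-in AttCAT-style scores: one value per input token per layer.
attcat_per_layer = [rng.standard_normal((batch, n_tokens)) for _ in range(n_layers)]

# Summing over layers is well defined for these vectors...
summed = np.sum(attcat_per_layer, axis=0)  # shape (batch, from_token)

# ...but chaining them with rollout is not.
try:
    attention_rollout(attcat_per_layer)
except ValueError as err:
    print("rollout failed:", err)
```

Under these assumptions, summation is the only one of the two aggregations that type-checks on per-layer vectors, which is why I am asking how the rollout variant is meant to be computed.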
Here is the part of the paper that discusses these two methods: