Hello,
I have been trying to understand how ScoreCAM works, and for that purpose I trained a classifier on images of cats and dogs. I more or less borrowed a simple NN architecture from this post. The architecture is quite simple, but it gives around 90%/83% train/test accuracy. When it came to the heatmap stage, however, I ran into problems.
The problem is that the activations become more and more negative as training progresses, and the final ReLU of the algorithm then turns the heatmap into an array of zeros.
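For reference, this is roughly the step I mean (a minimal NumPy sketch of the final ScoreCAM combination, not my actual code; `activations` and `score_weights` are placeholders for the target-layer activation maps and the per-channel scores):

```python
import numpy as np

# Placeholder shapes: K activation maps of size H x W from the target conv layer,
# and one ScoreCAM weight (class-score contribution of the masked input) per map.
K, H, W = 64, 14, 14
activations = np.random.randn(K, H, W)   # activation maps from the target layer
score_weights = np.random.randn(K)       # channel scores from the masked forward passes

# ScoreCAM: weighted sum of activation maps, followed by a ReLU.
cam = np.tensordot(score_weights, activations, axes=1)  # shape (H, W)
heatmap = np.maximum(cam, 0)                             # final ReLU

# If the weighted sum is negative at every pixel, the ReLU zeroes the whole map.
if not heatmap.any():
    print("All-zero heatmap: every pixel of the weighted sum was negative.")
```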
If I load a VGG model pretrained on ImageNet, the same images of cats and dogs work perfectly (the heatmaps look good and correct). This makes me think the actual problem is in my network and not in the method, but I would still like an expert opinion. What do you think can lead to zero-like heatmaps? Training goes well: accuracy rises and the losses drop. I believe there is enough data to train such a small network.
The architecture contains a ReLU at almost every step. I normalized the data to [0, 1] and then standardized the RGB channels separately (as in the original ScoreCAM paper).
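Concretely, my preprocessing looks roughly like this (a sketch; the mean/std values shown are the usual ImageNet statistics, used here only as placeholders for the ones I actually computed):

```python
import numpy as np

# Placeholder per-channel statistics; in practice these are computed
# from the training set (or taken from ImageNet if reusing a pretrained backbone).
CHANNEL_MEAN = np.array([0.485, 0.456, 0.406])
CHANNEL_STD = np.array([0.229, 0.224, 0.225])

def preprocess(image_uint8: np.ndarray) -> np.ndarray:
    """Scale an H x W x 3 uint8 image to [0, 1], then standardize each RGB channel."""
    x = image_uint8.astype(np.float32) / 255.0   # normalize to [0, 1]
    x = (x - CHANNEL_MEAN) / CHANNEL_STD         # per-channel standardization
    return x
```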
Thank you!