SELU weight init #11
Thanks for raising this point! You're right: the paper says on page 6 (https://arxiv.org/pdf/1706.02515.pdf) that one should use var = 1/n_weights. The fixed point is still attracting, so not initializing exactly right can still work, but yes, this should be modified. If someone can try it and check that things still work well (using 1/n_weights first, then 0.5/n_weights if that doesn't work well), I'll accept a pull request. Otherwise, I'll try it eventually when I have the time and change it.
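For anyone who wants to sanity-check this, here's a quick standalone sketch (assuming PyTorch; the depth and layer width are arbitrary) showing that with var = 1/fan_in the activations stay near zero mean and unit variance through a deep SELU stack:

```python
import torch

torch.manual_seed(0)
fan_in = 512
x = torch.randn(10000, fan_in)  # standardized inputs

# Stack of 20 linear layers with var = 1/fan_in weights, each followed by SELU.
for _ in range(20):
    w = torch.randn(fan_in, fan_in) / fan_in ** 0.5  # entries ~ N(0, 1/fan_in)
    x = torch.selu(x @ w)

# Both should stay close to the SELU fixed point (mean 0, variance 1).
print(x.mean().item(), x.var().item())
```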
I'm about to try it now. You're right that it says to use var = 1/n_weights.
Just saw this earlier today: http://cs231n.github.io/neural-networks-2/#init. It explains why fan_in is used.
Shouldn't the weight initialization for SELU be something like:
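Here's a minimal PyTorch sketch of what I mean (the helper name and the exact layer checks are just illustrative):

```python
import math
import torch.nn as nn

def selu_weight_init(m):
    # Draw weights from N(0, var) with var based on fan_in, per the SELU paper.
    if isinstance(m, nn.Linear):
        fan_in = m.in_features
        m.weight.data.normal_(0.0, math.sqrt(1.0 / fan_in))
    elif isinstance(m, nn.Conv2d):
        fan_in = m.in_channels * m.kernel_size[0] * m.kernel_size[1]
        # 0.5 factor for conv layers -- see the note below
        m.weight.data.normal_(0.0, math.sqrt(0.5 / fan_in))

# usage: model.apply(selu_weight_init)
```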
(The 0.5 factor for the conv layers comes from the PyTorch forums, where it reportedly worked for someone; elsewhere 1.0 is used.)