SELU weight init #11
Thanks for raising this point! You're right: the paper says on page 6 (https://arxiv.org/pdf/1706.02515.pdf) that one should use var = 1/n_weights. The fixed point is still attracting, so not initializing exactly right can still work, but yes, this should be modified. If someone can try it and check that things still work well (using 1/n_weights first, then 0.5/n_weights if that doesn't work well), I'll accept a pull request. Otherwise, I'll try it eventually when I have the time and change it.
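For anyone who wants to sanity-check this, here's a quick standalone sketch (assuming PyTorch; the depth and layer width are arbitrary) showing that with var = 1/fan_in the activations stay near zero mean and unit variance through a deep SELU stack:

```python
import torch

torch.manual_seed(0)
fan_in = 512
x = torch.randn(10000, fan_in)  # standardized inputs

# Stack of 20 linear layers with var = 1/fan_in weights, each followed by SELU.
for _ in range(20):
    w = torch.randn(fan_in, fan_in) / fan_in ** 0.5  # entries ~ N(0, 1/fan_in)
    x = torch.selu(x @ w)

# Both should stay close to the SELU fixed point (mean 0, variance 1).
print(x.mean().item(), x.var().item())
```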
I'm about to try it now. You're right that it says to use var = 1/n_weights.
Just saw this earlier today: http://cs231n.github.io/neural-networks-2/#init. It explains why fan_in is used.
Shouldn't the weight initialization for SELU be something like:
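Here's a minimal PyTorch sketch of what I mean (the helper name and the exact layer checks are just illustrative):

```python
import math
import torch.nn as nn

def selu_weight_init(m):
    # Draw weights from N(0, var) with var based on fan_in, per the SELU paper.
    if isinstance(m, nn.Linear):
        fan_in = m.in_features
        m.weight.data.normal_(0.0, math.sqrt(1.0 / fan_in))
    elif isinstance(m, nn.Conv2d):
        fan_in = m.in_channels * m.kernel_size[0] * m.kernel_size[1]
        # 0.5 factor for conv layers -- see the note below
        m.weight.data.normal_(0.0, math.sqrt(0.5 / fan_in))

# usage: model.apply(selu_weight_init)
```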
(The 0.5 factor for the conv layers comes from the PyTorch forums, where it reportedly worked for someone; elsewhere 1.0 is used.)