Fixup Initialization #33

Open
shindavid opened this issue Jan 30, 2023 · 1 comment

shindavid (Owner):
Used in KataGo, described here.

Basically, Fixup Initialization purportedly lets us remove batch normalization layers entirely, which brings a number of advantages described in the link.
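For reference, here is a rough sketch of what a batch-norm-free residual block with Fixup-style initialization could look like in PyTorch. The class and parameter names are illustrative assumptions, not KataGo's or this project's actual code; the key idea is the paper's rule of rescaling the first conv of each residual branch by L^(-1/(2m-2)) and zero-initializing the last conv, so every block starts as an identity mapping, with scalar biases and a scale standing in for batch norm's affine parameters.

```python
import torch
import torch.nn as nn

class FixupResBlock(nn.Module):
    """Residual block with no batch norm; stability comes from Fixup init.
    (Illustrative sketch only; names and layout are assumptions.)"""

    def __init__(self, channels: int, num_blocks: int):
        super().__init__()
        # Scalar biases and a final scale replace batch norm's affine part.
        self.bias1 = nn.Parameter(torch.zeros(1))
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bias2 = nn.Parameter(torch.zeros(1))
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.scale = nn.Parameter(torch.ones(1))
        self.bias3 = nn.Parameter(torch.zeros(1))

        # Fixup rule for a 2-layer branch (m = 2), L = num_blocks:
        #  - first conv: He init, rescaled by L^(-1/(2m-2)) = L^(-1/2)
        #  - last conv: zero-initialized, so the block starts as identity
        nn.init.kaiming_normal_(self.conv1.weight, nonlinearity='relu')
        with torch.no_grad():
            self.conv1.weight.mul_(num_blocks ** (-0.5))
        nn.init.zeros_(self.conv2.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.conv1(x + self.bias1)
        out = torch.relu(out)
        out = self.conv2(out + self.bias2)
        return x + out * self.scale + self.bias3
```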

I vaguely recall David Wu backtracking on the value of this idea when we spoke. So we should double-check with him on this.

shindavid (Owner, Author):
David Wu wrote this in an email:

...the part about fixup init is a bit outdated and is going to get updated once I publish the new architectures in a few months - fixup actually does have some significant costs on final neural net fitting quality, that I hadn't known at the time, so sticking with batch norm is probably the best approach still if you want to just get something simple working.
