KLD calculation #1
Comments
Pretty sure he multiplies the KLD by 0.1 because that is his KLD weight hyperparameter.
Also, while working on a VAE that I wrote based on this: if I change the offending mean to a sum, my reconstruction loss is much higher (about 2x), while my kl_loss starts out similar but decreases faster with torch.mean, and all my reconstructed images are basically the same blurry image. I have no idea why this would change the reconstruction so much; I will have to do some investigation.
Mean is equivalent to sum up to a scalar factor. Normally, with Adam, if we don't have a composite loss, this scale doesn't matter, so changing the sum to a mean should backpropagate the same way. I changed the mean to a sum and decreased the KLD weight, which fixed my problem. Basically, when I changed the mean to a sum, I put too much weight on the KL term and caused the latent distributions to be bound too strongly to the standard Gaussian.
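
A minimal sketch of why the reduction choice rescales the effective KL weight, assuming a standard PyTorch VAE loss (the names `vae_loss`, `beta`, `mu`, and `logvar` are illustrative, not taken from this repo):

```python
import torch
import torch.nn.functional as F

def vae_loss(recon_x, x, mu, logvar, beta=0.1):
    # Reconstruction term, summed over every element in the batch.
    bce = F.binary_cross_entropy(recon_x, x, reduction='sum')
    # KL divergence to a standard normal prior, summed over
    # the batch and latent dimensions.
    kld_sum = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # torch.mean divides the same quantity by batch_size * latent_dim,
    # so with the same beta the KL term is weighted far more weakly.
    kld_mean = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    assert torch.allclose(kld_mean * mu.numel(), kld_sum)
    return bce + beta * kld_sum
```

Switching mean to sum therefore multiplies the KL term by batch_size × latent_dim, so beta has to shrink by roughly the same factor to keep the same balance, which matches the fix described above.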
Hi,
I think there's an error in your KLD calculation.
This is what you wrote:
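(Presumably something like the following, judging from the mean/sum discussion above; `mu` and `logvar` are assumed variable names:)

```python
# Assumed reconstruction of the repo's version: mean reduction.
KLD = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
```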
Instead of (what I think it should be):
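(Again reconstructed under the same assumed names, this time summing, which is the standard VAE KLD from Kingma & Welling:)

```python
# Assumed corrected version: sum instead of mean.
KLD = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
```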
Let me know if I'm right.
Also, could you explain why you multiply KLD by 0.1?
Is that the same as multiplying BCE by a big number, say 1000 for example?