Replicating DeepMind's papers "β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework" and "Understanding disentangling in β-VAE"
Results of traversing each latent variable z from -3.0 to 3.0, with γ=100.0 and C=20.0.
Latent variables with small variances appear to capture the "x", "y", "rotation" and "scale" parameters.
(This experiment uses DeepMind's dSprites data set.)
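For context, γ and C are the two constants of the capacity-controlled objective from "Understanding disentangling in β-VAE": the reconstruction loss plus γ·|KL − C|, where C is the target KL capacity in nats. The following is a minimal pure-Python sketch of that objective for a single example, not the code used in this experiment. The linear schedule length `ANNEAL_STEPS`, the Bernoulli reconstruction loss, and all function names are assumptions made for illustration.

```python
import math

GAMMA = 100.0           # γ, the weight on the capacity penalty
C_MAX = 20.0            # C, treated here as the final KL capacity (assumption)
ANNEAL_STEPS = 100_000  # hypothetical length of the linear capacity schedule


def capacity(step):
    """Linearly raise the KL target from 0 to C_MAX nats, then hold it."""
    return min(C_MAX, C_MAX * step / ANNEAL_STEPS)


def kl_to_unit_gaussian(mu, log_var):
    """KL(N(mu, exp(log_var)) || N(0, I)), summed over latent dimensions."""
    return sum(0.5 * (math.exp(lv) + m * m - 1.0 - lv)
               for m, lv in zip(mu, log_var))


def bernoulli_nll(pixels, logits):
    """Sigmoid cross-entropy summed over pixels (numerically stable form)."""
    return sum(max(l, 0.0) - l * x + math.log1p(math.exp(-abs(l)))
               for x, l in zip(pixels, logits))


def objective(pixels, logits, mu, log_var, step):
    """Reconstruction loss plus GAMMA * |KL - C(step)|."""
    kl = kl_to_unit_gaussian(mu, log_var)
    return bernoulli_nll(pixels, logits) + GAMMA * abs(kl - capacity(step))


# Toy call: 4 "pixels" and 10 latent dimensions, as in the table of z0..z9.
pixels = [0.0, 1.0, 1.0, 0.0]
logits = [0.0] * 4
mu, log_var = [0.0] * 10, [0.0] * 10
print(objective(pixels, logits, mu, log_var, step=0))
```

Because the penalty is an absolute difference, the KL term is pushed toward C rather than toward zero, which is why a few latents can end up carrying most of the information while the rest stay close to the prior.
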
Z | Image | Parameter | Variance |
---|---|---|---|
z0 | ![]() | | 0.9216 |
z1 | ![]() | | 0.9216 |
z2 | ![]() | Rotation | 0.0011 |
z3 | ![]() | Rotation? | 0.0038 |
z4 | ![]() | Pos X | 0.0002 |
z5 | ![]() | | 0.9384 |
z6 | ![]() | Scale? | 0.0004 |
z7 | ![]() | | 0.8991 |
z8 | ![]() | | 0.9483 |
z9 | ![]() | Pos Y | 0.0004 |
Left: original image. Right: reconstructed image.
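The variance column in the table separates the latents into two groups: values near 1.0 stay close to the unit-Gaussian prior and carry little information, while values near 0 mark the latents that encode a factor. The sketch below reproduces that split from the reported numbers. The cutoff `THRESHOLD = 0.1` is an assumption chosen for illustration, and because the table does not report posterior means, the KL figure is only the variance-dependent part, which is a lower bound on the full per-dimension KL.

```python
import math

# Posterior variances copied from the table above, indexed z0..z9.
variances = [0.9216, 0.9216, 0.0011, 0.0038, 0.0002,
             0.9384, 0.0004, 0.8991, 0.9483, 0.0004]

THRESHOLD = 0.1  # assumed cutoff, not taken from the original experiment


def kl_variance_term(var):
    """Variance-only part of KL(N(mu, var) || N(0, 1)), in nats.

    The full per-dimension KL is 0.5 * (var + mu**2 - 1 - ln(var)).
    The mean is not reported in the table, so this is a lower bound.
    """
    return 0.5 * (var - 1.0 - math.log(var))


# Indices of latents whose variance falls below the assumed cutoff.
active = [i for i, v in enumerate(variances) if v < THRESHOLD]
print("informative latents:", ["z%d" % i for i in active])

for i, v in enumerate(variances):
    print("z%d  var=%.4f  KL>=%.3f nats" % (i, v, kl_variance_term(v)))
```

With this cutoff the selected latents are z2, z3, z4, z6 and z9, the same five rows that have a parameter label in the table.
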