Replies: 2 comments
-
Hi @moabarar! Thank you very much for your suggestions!
[1] I incorporated your code into the training loop. So far it seems to work, but I am still running some tests to see whether I can replicate the results. I will report back here once testing is done. It would be amazing if it works, because then we would only have to compile the training step once instead of pre-compiling it for each cutoff index.
[2] The EMA generator is initialized with different parameters, but the parameters from the training generator are then copied over, see here. Hence the parameters of the EMA generator and the training generator are identical at the beginning of training. I will add a comment to make this clearer.
[3] Yes, I agree. I will change that.
Thanks again!
-
Hi @moabarar, I just pushed the updated style mixing regularization (see eb1a5d5). Everything works great, thanks again!
-
Hi,
Thanks for sharing the training loop for StyleGAN2 - it looks good and well implemented. I have a couple of questions/suggestions that could improve the current training loop.
[1] In the style mixing regularization, why not use the original mixing done by the official StyleGAN2 repo? The current training loop seems slow at the beginning, and as you said in the repo this is related to the style mixing regularization. I tried to port the original scheme into my code as well. Note that I haven't done thorough testing on the snippet, but I thought it could be useful to share it.
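Below is a minimal JAX sketch of the idea (shapes and names are my assumptions, e.g. w latents broadcast to (batch, num_layers, w_dim)): sampling the cutoff as a traced value and masking with jnp.where means a single jit compilation covers every cutoff index, instead of one compiled step per index.

```python
import jax
import jax.numpy as jnp


def mix_styles(rng, w1, w2, num_layers, mixing_prob=0.9):
    """Style mixing over broadcast latents w1, w2 of shape
    (batch, num_layers, w_dim); returns the mixed latents."""
    rng_p, rng_c = jax.random.split(rng)
    # Traced cutoff index: the compiled step is reused for every value.
    cutoff = jax.random.randint(rng_c, (), 1, num_layers)
    # With probability 1 - mixing_prob, disable mixing by pushing the
    # cutoff past the last layer so all layers keep w1's style.
    cutoff = jnp.where(jax.random.uniform(rng_p, ()) < mixing_prob,
                       cutoff, num_layers)
    layer_idx = jnp.arange(num_layers)[None, :, None]  # (1, L, 1)
    # Layers before the cutoff take w1's style, the rest take w2's.
    return jnp.where(layer_idx < cutoff, w1, w2)
```

Under jit, num_layers has to be static (closed over or passed via static_argnums), while cutoff stays traced.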
[2] The EMA generator parameters are initially set to different parameters than the initial generator. This is mainly because you initialize the generator twice in different ways: once by initializing the mapping network and the synthesis network separately, and once by initializing the Generator model as a whole. The two initializations use the same PRNGKey, but when the key is used for the Generator model, I think the mapping network and the synthesis network will get different RNGs, because Flax splits the key between the submodule initializations. If I am not mistaken, the initial EMA model should be initialized to the initial generator model.
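As a self-contained toy of the effect (module names are illustrative, not the ones from this repo): initializing a submodule on its own versus through a parent module with the same top-level key yields different parameters, because Flax derives a separate key per submodule path. Copying the training generator's initial parameters sidesteps this entirely.

```python
import jax
import jax.numpy as jnp
import flax.linen as nn


class Mapping(nn.Module):
    @nn.compact
    def __call__(self, z):
        return nn.Dense(8)(z)


class ToyGenerator(nn.Module):
    @nn.compact
    def __call__(self, z):
        return nn.Dense(8)(Mapping()(z))


rng = jax.random.PRNGKey(0)
z = jnp.ones((1, 8))

# Same top-level key, yet the Dense kernel inside Mapping differs
# between the standalone init and the init through the parent:
params_separate = Mapping().init(rng, z)
params_full = ToyGenerator().init(rng, z)

# Robust fix: start the EMA generator as an exact copy of the
# training generator's initial parameters.
params_ema = jax.tree_util.tree_map(jnp.array, params_full)
```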
[3] I think it would be better to use different initialization keys for the generator and the discriminator. Currently they both use the same RNG key.
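A one-line sketch of what I mean (the seed value is arbitrary): splitting the top-level key once gives the two models independent initializations.

```python
import jax

rng = jax.random.PRNGKey(0)
rng, rng_g, rng_d = jax.random.split(rng, 3)
# params_g = generator.init(rng_g, latents)      # hypothetical calls,
# params_d = discriminator.init(rng_d, images)   # names are placeholders
```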
Again thanks for the hard work you're putting into this.
Cheers!