Replies: 2 comments
-
Hi @moabarar! Thank you very much for your suggestions!
[1] I incorporated your code into the training loop. So far it seems to work, but I am still running some tests to see whether I can replicate the results. I will report back here once testing is done. It would be amazing if it works, because then we would only have to compile the training step once instead of pre-compiling it for each cutoff index.
[2] The EMA generator is initialized with different parameters, but the parameters from the training generator are then copied over, see here. Hence the parameters of the EMA generator and the training generator are identical at the beginning of training. I will add a comment to make this clearer.
[3] Yes, I agree. I will change that.
Thanks again!
-
Hi @moabarar, I just pushed the updated style mixing regularization (see eb1a5d5). Everything works great, thanks again!
-
Hi,
Thanks for sharing the training loop for StyleGAN2 - it looks good and well implemented. I have a couple of questions/suggestions that could improve the current training loop.
[1] In the style mixing regularization, why not use the original mixing done by the official StyleGAN2 repo? The current training loop seems slow at the beginning, and as you said in the repo this is related to the style mixing regularization. I tried to port the original scheme into my code as well. Note that I haven't done thorough testing on the snippet, but I thought it could be useful to share it.
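Below is a minimal JAX sketch of the idea (shapes and names are my assumptions, e.g. w latents broadcast to (batch, num_layers, w_dim)): sampling the cutoff as a traced value and masking with jnp.where means a single jit compilation covers every cutoff index, instead of one compiled step per index.

```python
import jax
import jax.numpy as jnp


def mix_styles(rng, w1, w2, num_layers, mixing_prob=0.9):
    """Style mixing over broadcast latents w1, w2 of shape
    (batch, num_layers, w_dim); returns the mixed latents."""
    rng_p, rng_c = jax.random.split(rng)
    # Traced cutoff index: the compiled step is reused for every value.
    cutoff = jax.random.randint(rng_c, (), 1, num_layers)
    # With probability 1 - mixing_prob, disable mixing by pushing the
    # cutoff past the last layer so all layers keep w1's style.
    cutoff = jnp.where(jax.random.uniform(rng_p, ()) < mixing_prob,
                       cutoff, num_layers)
    layer_idx = jnp.arange(num_layers)[None, :, None]  # (1, L, 1)
    # Layers before the cutoff take w1's style, the rest take w2's.
    return jnp.where(layer_idx < cutoff, w1, w2)
```

Under jit, num_layers has to be static (closed over or passed via static_argnums), while cutoff stays traced.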
[2] The EMA generator parameters are initially set to different parameters than the initial generator. This is mainly because you initialize the generator twice in different ways: once by initializing the mapping network and the synthesis network separately, and once by initializing the Generator model as a whole. The two initializations use the same PRNGKey, but when the key is used for the Generator model, I think the mapping network and the synthesis network will get different RNGs, because Flax splits the key between the submodule initializations. If I am not mistaken, the initial EMA model should be initialized to the initial generator model.
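As a self-contained toy of the effect (module names are illustrative, not the ones from this repo): initializing a submodule on its own versus through a parent module with the same top-level key yields different parameters, because Flax derives a separate key per submodule path. Copying the training generator's initial parameters sidesteps this entirely.

```python
import jax
import jax.numpy as jnp
import flax.linen as nn


class Mapping(nn.Module):
    @nn.compact
    def __call__(self, z):
        return nn.Dense(8)(z)


class ToyGenerator(nn.Module):
    @nn.compact
    def __call__(self, z):
        return nn.Dense(8)(Mapping()(z))


rng = jax.random.PRNGKey(0)
z = jnp.ones((1, 8))

# Same top-level key, yet the Dense kernel inside Mapping differs
# between the standalone init and the init through the parent:
params_separate = Mapping().init(rng, z)
params_full = ToyGenerator().init(rng, z)

# Robust fix: start the EMA generator as an exact copy of the
# training generator's initial parameters.
params_ema = jax.tree_util.tree_map(jnp.array, params_full)
```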
[3] I think it would be better to use different initialization keys for the generator and the discriminator. Currently they both use the same RNG key.
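A one-line sketch of what I mean (the seed value is arbitrary): splitting the top-level key once gives the two models independent initializations.

```python
import jax

rng = jax.random.PRNGKey(0)
rng, rng_g, rng_d = jax.random.split(rng, 3)
# params_g = generator.init(rng_g, latents)      # hypothetical calls,
# params_d = discriminator.init(rng_d, images)   # names are placeholders
```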
Again thanks for the hard work you're putting into this.
Cheers!