Hi,

First of all, great work on this project!

I have a question regarding the architecture. What would happen if you introduced a generator network before the vocoder to produce mel spectrograms, and then trained that generator while keeping a pre-trained vocoder fixed? I'm curious how this approach might affect the performance and quality of the generated audio.

Looking forward to your thoughts on this.
That's an interesting idea! It sounds like it should work fairly well, but I don't have a good feel for the tradeoffs. It might make the generator's task much easier and improve overall performance, or the error accumulated by adding another model to the pipeline might make things worse on average. My suspicion is that it would improve things a bit. Regardless, it's a very cool idea; if you try it out, do be sure to let us know how it goes!
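For anyone who wants to experiment with this, here is a minimal PyTorch sketch of the proposed setup: a generator producing mel spectrograms, followed by a frozen pre-trained vocoder, with gradients flowing through the vocoder back into the generator. The `Generator` and `Vocoder` modules, their shapes, and the loss are all illustrative stand-ins (not this project's actual code); in practice you would load real vocoder weights (e.g. a HiFi-GAN checkpoint) and use a proper audio reconstruction loss.

```python
import torch
import torch.nn as nn

N_MELS = 80  # assumed mel-spectrogram height

class Generator(nn.Module):
    """Hypothetical network mapping a latent code to a mel spectrogram."""
    def __init__(self, latent_dim=128, frames=32):
        super().__init__()
        self.frames = frames
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, N_MELS * frames),
        )

    def forward(self, z):
        out = self.net(z)
        return out.view(z.size(0), N_MELS, self.frames)  # (B, N_MELS, T)

class Vocoder(nn.Module):
    """Stand-in for a pre-trained mel -> waveform model."""
    def __init__(self, hop=256):
        super().__init__()
        self.proj = nn.Linear(N_MELS, hop)

    def forward(self, mel):                   # mel: (B, N_MELS, T)
        wav = self.proj(mel.transpose(1, 2))  # (B, T, hop)
        return wav.reshape(mel.size(0), -1)   # (B, T * hop)

generator = Generator()
vocoder = Vocoder()  # in practice: load pre-trained weights here

# Freeze the vocoder: only the generator receives parameter updates,
# but gradients still flow *through* the vocoder to the generator.
for p in vocoder.parameters():
    p.requires_grad_(False)
vocoder.eval()

opt = torch.optim.Adam(generator.parameters(), lr=1e-4)

z = torch.randn(4, 128)                  # latent batch (illustrative)
target_wav = torch.randn(4, 32 * 256)    # stand-in ground-truth audio

mel = generator(z)                       # (4, 80, 32)
wav = vocoder(mel)
loss = nn.functional.l1_loss(wav, target_wav)

opt.zero_grad()
loss.backward()
opt.step()
```

One design note: because the vocoder is frozen, any mismatch between the generator's mel statistics and the mels the vocoder was trained on goes uncorrected, which is one place the error accumulation mentioned above could show up.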