Text to unit training code #2
Comments
I have opted for BigVSAN. I was really impressed by its quality; I wasn't able to spot any difference between synthesized and real audio on my datasets. I have published an easy-to-use library, and you can check the evaluation notebook, which requires only PyTorch: https://github.com/ex3ndr/supervoice-vocoder I am in the process of training. I had to restart from scratch because I replaced the vocoder, but it was already impressive. The vocoder was the problem for me, and now it is not.
Looks good!
@ex3ndr how many hours of data are you training the new model on? I am also planning to train on my own dataset if your training goes well.
I am training on a fairly small dataset: LibriTTS-R + VCTK. They contain only high-quality voice recordings, but I want to try pre-training on a much bigger dataset to cover many languages and phonemes, and then fine-tune on a higher-quality one.
Hi, the work definitely looks promising, especially since, for some reason, no one is really trying to figure out VoiceBox.
So, yeah, thanks for your work!
I see that you are currently working on the vocoder part of it. Is the text-to-units part covered end-to-end in this repo, or is anything left to incorporate? I was just curious about that.