Amphion VALL-E new version release #220
Conversation
Thanks Jiaqi for the great effort! I found that a lot of the code is borrowed from SpeechTokenizer. You could add an acknowledgment in the main README.
Do we have any pretrained models or a demo for this new VALL-E?
This is detailed in the README file in egs/tts/valle_v2, and demo.ipynb has also been uploaded for running inference with pretrained weights.
Hi @RMSnow, thanks for your review! I've updated the code, and your previous review questions have been resolved.
Hi @jiaqili3, please update the demo.ipynb. Everything else looks good to me.
Updated. Thanks @RMSnow |
LGTM
✨ Description
In this PR, we release an unofficial PyTorch implementation of VALL-E, a zero-shot voice cloning model based on neural codec language modeling. If trained properly, this model can match the performance reported in the original paper.
This is a refined version of the first VALL-E implementation in Amphion: we have switched the underlying implementation to a Llama architecture to provide better model performance, faster training, and more readable code.
This can be a great tool for users who want to learn about speech language models and their implementation. A toy sketch of the core idea is given below.
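To illustrate the neural codec language modeling idea behind VALL-E, here is a minimal, self-contained toy sketch: a decoder-style network predicts discrete codec tokens autoregressively, conditioned on phoneme tokens plus the codec tokens of a short voice prompt. All names, sizes, and vocabularies here are placeholders for illustration only and are not Amphion's actual API; for real inference with pretrained weights, please use egs/tts/valle_v2/demo.ipynb.

```python
# Toy illustration of codec language modeling (not Amphion's real model or API).
import torch
import torch.nn as nn

PHONE_VOCAB, CODEC_VOCAB, DIM = 100, 1024, 256  # placeholder sizes

class ToyCodecLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.phone_emb = nn.Embedding(PHONE_VOCAB, DIM)
        self.codec_emb = nn.Embedding(CODEC_VOCAB, DIM)
        layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(DIM, CODEC_VOCAB)

    def forward(self, phones, codec_prefix):
        # Concatenate the text condition and the acoustic token prefix into one
        # sequence, then predict logits for the next codec token
        # (causal masking omitted for brevity).
        x = torch.cat([self.phone_emb(phones), self.codec_emb(codec_prefix)], dim=1)
        h = self.backbone(x)
        return self.head(h[:, -1])

model = ToyCodecLM()
phones = torch.randint(0, PHONE_VOCAB, (1, 20))    # target text as phoneme ids
prompt = torch.randint(0, CODEC_VOCAB, (1, 150))   # codec tokens of the voice prompt
generated = prompt
for _ in range(10):  # autoregressively extend the acoustic token sequence
    logits = model(phones, generated)
    next_tok = logits.argmax(-1, keepdim=True)
    generated = torch.cat([generated, next_tok], dim=1)
print(generated.shape)  # prompt plus 10 newly generated codec tokens
```

In the actual model, the generated codec tokens would be decoded back to a waveform by the neural codec, which is what enables zero-shot cloning of the prompt speaker's voice.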
🚧 Related Issues
None
👨💻 Changes Proposed
🧑🤝🧑 Who Can Review?
@HeCheng0625 @RMSnow @HarryHe11 @zhizhengwu
✅ Checklist