Amphion VALL-E new version release #220

jiaqili3 · 2024-06-17T09:48:52Z

✨ Description

In this PR, we release an unofficial PyTorch implementation of VALL-E, a zero-shot voice cloning model based on neural codec language modeling. If trained properly, this model could match the performance reported in the original paper.
This is a refined version of the first VALL-E implementation in Amphion. We have changed the underlying implementation to Llama to provide better model performance, faster training, and more readable code.
It can be a useful tool for users who want to learn about speech language models and their implementation.
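
As a rough illustration of what "neural codec language modeling" means here, the sketch below lays out text tokens and discrete codec codes in one token sequence that an autoregressive language model could be trained on. This is not the code in this PR: the function names, the vocabulary sizes, the separator token, and the offset scheme are all assumptions made for illustration.

```python
from typing import List, Sequence


def build_ar_sequence(
    text_ids: Sequence[int],
    audio_codes: Sequence[int],
    text_vocab_size: int,
    codebook_size: int,
) -> List[int]:
    """Place text tokens and first-codebook audio codes in one shared vocabulary.

    Hypothetical layout (an assumption, not the layout used in this PR):
      [0, text_vocab_size)                                -> text tokens
      [text_vocab_size, text_vocab_size + codebook_size)  -> audio codes
      text_vocab_size + codebook_size                     -> separator token
    """
    if any(t < 0 or t >= text_vocab_size for t in text_ids):
        raise ValueError("text id out of range")
    if any(c < 0 or c >= codebook_size for c in audio_codes):
        raise ValueError("audio code out of range")

    sep_id = text_vocab_size + codebook_size
    # Shift audio codes so they do not collide with text token ids.
    shifted_audio = [text_vocab_size + c for c in audio_codes]
    return list(text_ids) + [sep_id] + shifted_audio


def to_audio_code(token_id: int, text_vocab_size: int) -> int:
    """Map a predicted token id back to a codec code index."""
    return token_id - text_vocab_size


# Toy usage: three text tokens followed by two codec codes.
sequence = build_ar_sequence([3, 1, 4], [0, 1023], text_vocab_size=100, codebook_size=1024)
```

At inference time, a model trained on such sequences would continue the audio part token by token, and the predicted ids would be mapped back to codec codes before decoding to a waveform.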

🚧 Related Issues

None

👨‍💻 Changes Proposed

- We have changed the underlying implementation to Llama to provide better model performance, faster training, and more readable code.
- We provide a more detailed README.md that covers reproducing our models with pretrained weights, training on LibriTTS, and future plans for improving the model.
- We use a refined codec model named SpeechTokenizer as the codec, yielding better modeling quality than the original Encodec.

🧑‍🤝‍🧑 Who Can Review?

@HeCheng0625 @RMSnow @HarryHe11 @zhizhengwu

✅ Checklist

- Code has been reviewed
- Code complies with the project's code standards and best practices
- Code has passed all tests
- Code does not affect the normal use of existing features
- Code has been commented properly
- Documentation has been updated (if applicable)
- Demo/checkpoint has been attached (if applicable)

RMSnow

Thanks, Jiaqi, for the great effort! I found that a lot of the code is borrowed from SpeechTokenizer. Please add an acknowledgement in the main README.

README.md

bins/tts/train.py

egs/tts/valle_v2/README.md

egs/tts/valle_v2/train_ar_libritts.sh

models/codec/speechtokenizer/quantization/core_vq.py

models/codec/speechtokenizer/quantization/distrib.py

models/codec/speechtokenizer/quantization/vq.py

models/tts/valle_v2/modeling_llama.py

models/tts/valle_v2/run_infer.py

RMSnow · 2024-06-17T20:47:57Z

Do we have any pretrained models or demo for this new valle?

jiaqili3 · 2024-06-18T02:39:41Z

> Do we have any pretrained models or demo for this new valle?

Yes. The details are in the README file in egs/tts/valle_v2, and a demo.ipynb has also been uploaded for running inference with the pretrained weights.

jiaqili3 · 2024-06-19T08:46:56Z

Hi @RMSnow, thanks for your review! I've updated the code, and the questions from your previous review have been resolved.

bins/tts/train.py

egs/tts/valle_v2/demo.ipynb

RMSnow · 2024-06-20T03:18:39Z

Hi @jiaqili3, please update the demo.ipynb. Everything else looks good to me.

jiaqili3 · 2024-06-21T12:34:20Z

Updated. Thanks @RMSnow

RMSnow

LGTM

jiaqili3 and others added 5 commits June 17, 2024 17:45

add new valle release files

5bc6553

add reference to original paper link

20cee95

reformat train.py using black

122754c

Update dataset information in README.md

7c26c87

Update news in README.md

593f06c

RMSnow requested changes Jun 17, 2024

jiaqili3 and others added 4 commits June 18, 2024 10:33

update acknowledgement in README.md

6b63716

add demo ipynb

45f5450

Merge branch 'release_vallev2' of https://github.com/jiaqili3/Amphion…

1b2c31c

…VALLEv2 into release_vallev2

remove unused files

1a46052

jiaqili3 added 7 commits June 18, 2024 10:47

remove useless codes in speech tokenizer

348de85

update copyright info in tokenizer file

e69f578

add copyright info

d6e003a

update model naming consistency

09c599a

update folder naming consistency for VALLE_V2

2c38b42

specifiy transformers version in the main env.sh as well.

e8b6a62

name VALLE to VALL-E in readme files

3899c6f

RMSnow self-requested a review June 20, 2024 03:05

Update README.md

670db6f

RMSnow requested changes Jun 20, 2024

bins/tts/train.py (resolved)

egs/tts/valle_v2/demo.ipynb (outdated, resolved)

jiaqili3 added 5 commits June 21, 2024 20:29

update demo.ipynb

1c16ebc

add copyrights to tokenizer codes

b8de0ed

update tokenizer imports

137d481

Merge branch 'release_vallev2' of https://github.com/jiaqili3/Amphion…

3f50dc8

…VALLEv2 into release_vallev2

format codes using black

761c566

RMSnow self-requested a review June 21, 2024 18:00

RMSnow approved these changes Jun 21, 2024

RMSnow changed the title ~~Amphion VALLE new version release~~ Amphion VALL-E new version release Jun 21, 2024

RMSnow merged commit f96a153 into open-mmlab:main Jun 21, 2024
1 check passed
