Commit

readme
cantabile-kwok committed Oct 7, 2023
1 parent 5d80916 commit 7c0763f
Showing 2 changed files with 30 additions and 2 deletions.
32 changes: 30 additions & 2 deletions README.md
@@ -1,2 +1,30 @@
# VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching
> This is the official implementation of [VoiceFlow](https://arxiv.org/abs/2309.05027).
# \[Work in Progress\] VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching
> This is the official implementation of [VoiceFlow](https://arxiv.org/abs/2309.05027).
![traj](resources/traj.png)

## Environment Setup

## Data Preparation

## Training

## Inference

## Acknowledgement
The following repositories were used as references during development:
* [Kaldi](https://github.com/kaldi-asr/kaldi), for most utility scripts in `utils/`.
* [GradTTS](https://github.com/huawei-noah/Speech-Backbones/tree/main/Grad-TTS), from which most of the model architecture and training pipelines are adopted.
* [VITS](https://github.com/jaywalnut310/vits), whose distributed bucket sampler is used.
* [CFM](https://github.com/atong01/conditional-flow-matching), for the ODE samplers.
## Citation
```
@misc{guo2023voiceflow,
title={VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching},
author={Yiwei Guo and Chenpeng Du and Ziyang Ma and Xie Chen and Kai Yu},
year={2023},
eprint={2309.05027},
archivePrefix={arXiv},
primaryClass={eess.AS}
}
```
Binary file added resources/traj.png
