Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction

This is an unofficial implementation of the paper "Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction".

Training:

First, prepare the data: convert the VCTK dataset to WAV format and trim the leading and trailing silence from each recording. See pres/data_pre.py for reference.
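The silence-trimming step can be sketched as follows. This is only an illustration using the Python standard library, not the code in pres/data_pre.py: the function names trim_silence and trim_wav, the amplitude threshold of 0.01, and the restriction to 16-bit mono WAV are all assumptions made for the example.

```python
import array
import wave


def trim_silence(samples, threshold):
    """Return `samples` without leading and trailing low-amplitude values.

    `threshold` is a hypothetical amplitude cutoff in the same units as
    `samples`; it is not a value taken from the repository.
    """
    start = 0
    end = len(samples)
    # Advance past leading samples whose magnitude is below the cutoff.
    while start < end and abs(samples[start]) < threshold:
        start += 1
    # Walk back past trailing samples whose magnitude is below the cutoff.
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]


def trim_wav(src_path, dst_path, threshold=0.01):
    """Trim head and tail silence from a 16-bit mono WAV file.

    `threshold` is a fraction of full scale (assumed default: 0.01).
    """
    with wave.open(src_path, "rb") as reader:
        params = reader.getparams()
        if params.sampwidth != 2 or params.nchannels != 1:
            raise ValueError("this sketch handles 16-bit mono WAV only")
        pcm = array.array("h", reader.readframes(params.nframes))
    # Convert the fractional threshold to the 16-bit integer scale.
    kept = trim_silence(pcm, threshold * 32768)
    with wave.open(dst_path, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(params.framerate)
        writer.writeframes(kept.tobytes())
```

A simple amplitude cutoff like this keeps any quiet passages inside the utterance and only removes the silent head and tail; an energy-based or frame-based detector may be more robust on noisy recordings.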

To train the model, run:

python train.py --gpu_avail [gpu_ids] --batch_size [batch] --init_lr [initial learning rate] --data_dir [dir of VCTK dataset] --epochs [number of training epochs]

To stop after a fixed number of steps instead, replace --epochs with --steps [number of steps at which training stops].

Alternatively, change the default options in train.py.
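For orientation, the training flags could be declared with argparse roughly as follows. This is a sketch, not the parser in train.py: the flag names come from the command in this README, but the types, the default values, and the choice to make --epochs and --steps mutually exclusive are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the flags named in the training command.

    Types and defaults are illustrative assumptions, not values from train.py.
    """
    parser = argparse.ArgumentParser(description="Training options (sketch)")
    parser.add_argument("--gpu_avail", type=str, default="0",
                        help="GPU ids to use (assumed comma-separated)")
    parser.add_argument("--batch_size", type=int, default=16)
    parser.add_argument("--init_lr", type=float, default=2e-4,
                        help="initial learning rate")
    parser.add_argument("--data_dir", type=str, required=True,
                        help="directory of the VCTK dataset")
    # Assumption: training stops on either an epoch count or a step count.
    stop = parser.add_mutually_exclusive_group()
    stop.add_argument("--epochs", type=int, help="number of training epochs")
    stop.add_argument("--steps", type=int,
                      help="number of steps at which training stops")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With a mutually exclusive group, passing both --epochs and --steps makes argparse exit with a usage error, so the stopping criterion is never ambiguous.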

To run inference or evaluation, refer to evaluation.py.

Notes:

The refercode folder contains reference code from https://github.com/jik876/hifi-gan, https://github.com/facebookresearch/ConvNeXt and https://github.com/huyanxin/phasen
The runs folder contains TensorBoard logs
The AudioSamples folder contains audio samples for comparison, along with their spectra

Citations:

@article{lu2024towards,
  title={Towards high-quality and efficient speech bandwidth extension with parallel amplitude and phase prediction},
  author={Lu, Ye-Xin and Ai, Yang and Du, Hui-Peng and Ling, Zhen-Hua},
  journal={arXiv preprint arXiv:2401.06387},
  year={2024}
}
