PyTorch toolkit for streaming speech recognition, speech translation and simultaneous translation based on fairseq.

George0828Zhang/simulst

Simultaneous Speech Translation

Code base for simultaneous speech translation experiments. It is based on fairseq.

Implemented

Encoder

Streaming Models

Setup

  1. Install fairseq
     git clone https://github.com/pytorch/fairseq.git
     cd fairseq
     git checkout 4a7835b
     python setup.py build_ext --inplace
     pip install .
  2. (Optional) Install apex for faster mixed precision (fp16) training.
  3. Install dependencies
     pip install -r requirements.txt
  4. Update submodules
     git submodule update --init --recursive

Pre-trained model

An ASR model with an Emformer encoder and a Transformer decoder, pre-trained with a joint CTC and cross-entropy loss.

| MuST-C (WER) | en-de (V2) | en-es    |
|--------------|------------|----------|
| dev          | 9.65       | 14.44    |
| tst-COMMON   | 12.85      | 14.02    |
| model        | download   | download |
| vocab        | download   | download |
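
For readers unfamiliar with the metric in the table above, WER (word error rate) is the word-level edit distance between hypothesis and reference, divided by the number of reference words. The following is a minimal sketch only: the function name `word_error_rate` is hypothetical, whitespace tokenization with no text normalization is an assumption, and this is not necessarily the scoring script used to produce the numbers reported here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared verbatim
    (no lowercasing or punctuation removal).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"
    # One substitution and one deletion against six reference words.
    print(f"WER: {100 * word_error_rate(ref, hyp):.2f}")
```

Corpus-level WER is usually computed by summing the edit distances over all utterances and dividing by the total number of reference words, rather than by averaging per-utterance rates.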

Sequence-level Knowledge Distillation

| MuST-C (BLEU) | en-de (V2) |
|---------------|------------|
| valid         | 31.76      |
| distillation  | download   |
| vocab         | download   |

Citation

Please consider citing our paper:

@inproceedings{chang22f_interspeech,
  author={Chih-Chiang Chang and Hung-yi Lee},
  title={{Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation}},
  year=2022,
  booktitle={Proc. Interspeech 2022},
  pages={5175--5179},
  doi={10.21437/Interspeech.2022-10627}
}
