Top-notch Automatic speech Recognition System (TARS)
Follow the instructions as the original wavenet implementation.
However, instead of installing
scikits.audiolab
, use soundfile
instead, which supports Python 3.x
Also, when compiling Libsndfile from source / installing from brew, make sure to enable Flac support
MacOS: (Courtesy of this person)
brew install libsndfile --with-lame --with-flac --with-libvorbis
brew link --overwrite libsndfile
preprocess.py - written by someone else,used to create lmfcc from the sound files and store those the labels in asset/data/preprocess folder Conf.py - Parameters and alphabet mapping
train.py - Main file for training, run this to get the log files which you can plot in tensorboard model.py - Creates the graph for the model dataloder.py - The dataloading pipeline, loads data efficiently. Does the batching too! test_ctc.py - Some experiments, can be ignored