ElevenLabs Synthetic Dataset Generator

Place the generate.py and convert.py files in a directory alongside a metadata.csv file.

Each line of metadata.csv MUST contain exactly one pipe character (|), separating the file name from the text to be spoken:

file_name|text content in the file
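Because the format depends on exactly one pipe per line, it can help to check metadata.csv before spending API credits. The following is a minimal sketch, not part of this repo: `parse_metadata_lines` is a hypothetical helper, and skipping blank lines is an assumption about how generate.py treats them.

```python
from typing import List, Tuple


def parse_metadata_lines(lines: List[str]) -> Tuple[List[Tuple[str, str]], List[int]]:
    """Split each line on its single pipe; collect 1-based numbers of lines that break the rule.

    Hypothetical helper for illustration; generate.py may parse the file differently.
    """
    rows = []
    bad_lines = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # assumption: blank lines are ignored
        if line.count("|") != 1:
            bad_lines.append(number)  # zero pipes or more than one pipe
            continue
        file_name, text = line.split("|")
        rows.append((file_name.strip(), text.strip()))
    return rows, bad_lines
```

Calling it with the lines of metadata.csv returns the valid (file_name, text) pairs together with the numbers of any offending lines, so they can be fixed before running the generator.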

Then run the generator Python script:

python3 generate.py

Once generation finishes, convert the output mp3 files to wav:

python3 convert.py
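For readers curious what an mp3-to-wav pass can look like, here is a rough sketch that shells out to ffmpeg. It is an assumption that ffmpeg is installed and on PATH, and convert.py in this repo may well use a different tool or different options. The names `build_ffmpeg_command` and `convert_directory` are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(mp3_path: Path) -> List[str]:
    """Return an ffmpeg command that writes a .wav next to the given .mp3."""
    wav_path = mp3_path.with_suffix(".wav")
    # -y overwrites an existing output file; -i names the input file.
    return ["ffmpeg", "-y", "-i", str(mp3_path), str(wav_path)]


def convert_directory(directory: Path) -> None:
    """Convert every .mp3 in the directory; assumes ffmpeg is available on PATH."""
    for mp3_path in sorted(directory.glob("*.mp3")):
        # check=True raises CalledProcessError if ffmpeg exits with an error.
        subprocess.run(build_ffmpeg_command(mp3_path), check=True)
```

Keeping the command construction separate from the subprocess call makes it easy to inspect or adjust the command (for example, to add a sample-rate option required by a training pipeline) without touching the loop.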

The generator script creates a venv, installs its dependencies, activates the venv, and creates an output directory called output_audio. It then works through the csv row by row, synthesizing the text of each row and saving the audio under the associated file name.

To get started, clone this repo, add your ElevenLabs API key and desired voice ID, and test with the included numbers.csv.

If a run fails partway through, restarting the script automatically resumes from where it left off in the csv.
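One common way to get this kind of resume behavior is to skip rows whose output file already exists. The sketch below illustrates that idea only; whether generate.py tracks progress this way is an assumption, and `pending_rows` is a hypothetical name.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def pending_rows(
    rows: Iterable[Tuple[str, str]],
    output_dir: Path,
    extension: str = ".mp3",
) -> List[Tuple[str, str]]:
    """Return the (file_name, text) rows that still need audio generated."""
    todo = []
    for file_name, text in rows:
        target = output_dir / f"{file_name}{extension}"
        # Only a non-empty file counts as done, so an empty file left by a
        # crash mid-write is regenerated.
        if target.exists() and target.stat().st_size > 0:
            continue
        todo.append((file_name, text))
    return todo
```

With this approach no separate progress file is needed: the contents of output_audio are the record of what has been completed.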

Train on this dataset using the conditioned version of Piper for Jetson platforms.

Repository: robit-man/synthetic_speech_dataset_generator
