Slothologist/SpeechRecPipeLine

SpeechRecPipeLine

Pipeline for speech recognition, covering everything from sound source localisation to natural language processing

Disclaimer

This speech recognition pipeline is in development and is currently not recommended for use. Requirements will most likely change in the near future: we will move away from jackaudio as the sound framework to a more dedicated framework called esiaf (see https://github.com/Slothologist/esiaf_ros), which is currently being worked on.

Principles

  • Audio is transmitted between most stages via Jack audio (TODO: find a way to reliably match sound source localisation (SSL) results to recognized speech)
  • Additional information is transmitted using ROS
  • Highly modular design to ensure maximum reusability and flexibility

Stages

  1. Recording

    • no special software available yet; recording can be done by jackaudio itself
    • (will in the future be done by a dedicated esiaf node)
  2. Sound Source Localisation, Separation, Filtering

    • will most likely be based on ODAS, but no implementation yet
  3. Segmentation

  4. Speech recognition

  5. Natural Language Preprocessing

    • no software available yet
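
As a rough illustration of the modular design described above, the five stages can be thought of as interchangeable functions chained in order. This is only a sketch in plain Python: `Chunk`, `make_stage`, `run_pipeline`, and the stage names are hypothetical and are not part of this repository, and no Jack audio or ROS calls are involved.

```python
from dataclasses import dataclass, field
from typing import Callable, List


# Hypothetical container for what travels between stages: audio samples
# plus side information (which the real pipeline would send via ROS).
@dataclass
class Chunk:
    samples: List[float]
    info: dict = field(default_factory=dict)


Stage = Callable[[Chunk], Chunk]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran."""
    def stage(chunk: Chunk) -> Chunk:
        visited = chunk.info.get("visited", []) + [name]
        return Chunk(chunk.samples, {**chunk.info, "visited": visited})
    return stage


def run_pipeline(stages: List[Stage], chunk: Chunk) -> Chunk:
    """Pass the chunk through every stage in order."""
    for stage in stages:
        chunk = stage(chunk)
    return chunk


# Stage order taken from the list above; the identifiers are illustrative.
STAGE_NAMES = [
    "recording",
    "localisation_separation_filtering",
    "segmentation",
    "speech_recognition",
    "natural_language_preprocessing",
]

pipeline = [make_stage(name) for name in STAGE_NAMES]
result = run_pipeline(pipeline, Chunk(samples=[0.0, 0.1, -0.1]))
print(result.info["visited"])
```

Because every stage shares the same signature, a placeholder can be swapped for a real implementation (or removed) without touching the other stages.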

Dependencies

Required

Recommended

Optional

Certain parts of the pipeline can have different or additional requirements. Be sure to check those parts as well!
