Speech Assessment API in FastAPI with HuggingFace 🤗

aryanxxvii/lark

Lark API Readme

What is it?

  • Lark API is a speech assessment REST API built using FastAPI in Python.
  • It provides accuracy scores, a speech-to-text transcription, and a projected IELTS pronunciation band.
  • It lets English-learning apps and websites assess users’ pronunciation and give real-time feedback on it.

How does it work?

ML:

  • Lark utilizes the Wav2Vec2 model from Meta for analyzing the speech sample.
  • It converts the speech to its phonetic transcription (S2P) using zero-shot cross-lingual recognition.
  • After recognizing the phonemes in the speech, it compares them with the ideal pronunciation of the transcribed text using the Jaro-Winkler string similarity algorithm.
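
The comparison step above can be illustrated with a small pure-Python sketch of Jaro-Winkler similarity. This is not Lark's actual code: the function names, the default prefix parameters (0.1 scale, 4-character prefix cap), and the idea of reporting the score as a percentage are assumptions made for illustration, and the project may well use an existing string-similarity library instead.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between two strings, in the range [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are equal and close enough.
    window = max(max(len1, len2) // 2 - 1, 0)
    s1_matched = [False] * len1
    s2_matched = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not s2_matched[j] and s2[j] == ch:
                s1_matched[i] = s2_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if s1_matched[i]:
            while not s2_matched[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str,
                 prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def pronunciation_score(spoken_phonemes: str, ideal_phonemes: str) -> float:
    """Hypothetical helper: similarity expressed as a 0-100 score."""
    return round(100 * jaro_winkler(spoken_phonemes, ideal_phonemes), 2)
```

In this sketch, `pronunciation_score` would be called with the phoneme string recognized from the audio and the reference phoneme string for the transcribed text; how Lark maps the resulting similarity to an accuracy score or an IELTS band is not described here.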

Backend:

  • The API is written completely in FastAPI with MongoDB as the database.

Frontend:

  • The Frontend is written using ReactJS and TailwindCSS.

ML Models used

References