- Lark API is a speech assessment REST API built using FastAPI in Python.
- It provides accuracy scores, speech-to-text transcription, and a projected IELTS pronunciation band.
- It lets English learning apps and websites assess users' pronunciation and give them real-time feedback.
- Lark uses the Wav2Vec2 model from Meta to analyze the speech sample.
- It converts the speech to its phonetic transcription (S2P) using zero-shot cross-lingual recognition.
- Once the phonetic transcription is obtained, Lark compares it with the ideal pronunciation of the transcribed text using the Jaro-Winkler string similarity algorithm.
- The API is implemented entirely with FastAPI and uses MongoDB as its database.
- The frontend is written using ReactJS and TailwindCSS.
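
The comparison step described above can be illustrated with a small, dependency-free sketch of Jaro-Winkler similarity. This is not Lark's actual code: the source does not say whether Lark uses a library or its own implementation, and the function names `jaro` and `jaro_winkler`, the parameters `prefix_scale` and `max_prefix`, and the sample phoneme strings are all hypothetical and chosen only for illustration.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between two strings, in the range [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0

    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (
        matches / len1
        + matches / len2
        + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


if __name__ == "__main__":
    # Hypothetical phoneme strings: what the learner said vs. the ideal pronunciation.
    recognized = "hɛloʊ wɜld"
    ideal = "həloʊ wɜrld"
    print(f"similarity: {jaro_winkler(recognized, ideal):.3f}")
```

A score of 1.0 means the two phonetic strings are identical, and lower values indicate a larger mismatch. How Lark maps this similarity to its accuracy score or IELTS band is not described here, so that mapping is left out of the sketch.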