
Solution for the Loop Q Prize 2022: A speech emotion recognition DL model


Dundalia/LoopQPrize_2022


LoopQPrize_2022

Speech emotion recognition remains a difficult task with several open questions, including which input features work best and which neural architecture is most effective. I adopt a combination of input features: Mel spectrogram, Mel-frequency cepstral coefficients (MFCCs), chromagram, spectral contrast, and Tonnetz representation. I propose an architecture based on bidirectional long short-term memory (LSTM) layers, which exploit the temporal information in audio recordings. I trained the network on audio files from four sources: the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), the Crowd-sourced Emotional Multimodal Actors Dataset (CREMA-D), Surrey Audio-Visual Expressed Emotion (SAVEE), and the Toronto Emotional Speech Set (TESS).
