Automatic Speech Emotion Recognition Using Recurrent Neural Networks with Local Attention

Automatic emotion recognition from speech is a challenging task that relies heavily on the emotional relevance of the specific features extracted from the speech signal. In this study, our goal is to use deep learning to discover emotionally relevant features automatically. We show that a deep Recurrent Neural Network (RNN) can learn both the short-time, frame-level acoustic features that are emotionally relevant and an appropriate temporal aggregation of those features into a compact sentence-level representation. Moreover, we propose a novel strategy for feature pooling over time that uses an attention mechanism with the RNN to focus on local regions of a speech signal that are more emotionally salient. The proposed solution was tested on the IEMOCAP emotion corpus and was shown to provide more accurate predictions than existing emotion recognition algorithms.
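
To make the idea of attention-based pooling over time concrete, here is a minimal NumPy sketch of one common formulation. It is an illustration under assumptions, not the implementation from the study: the names `attention_pool`, `mean_pool`, `frames`, and `u` are hypothetical, the frame-level vectors are random stand-ins for RNN outputs, and the scoring vector `u` would be learned during training rather than sampled.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()

def attention_pool(frames, u):
    """Pool frame-level vectors into one sentence-level vector.

    frames: array of shape (T, D), one row per time frame
            (assumed here to be RNN outputs).
    u:      array of shape (D,), a scoring vector
            (assumed to be a learned parameter).
    Returns the pooled vector of shape (D,) and the weights of shape (T,).
    """
    scores = frames @ u          # one scalar relevance score per frame
    weights = softmax(scores)    # non-negative, sums to 1 over time
    pooled = weights @ frames    # weighted average of the frames
    return pooled, weights

def mean_pool(frames):
    """Baseline: every frame contributes equally."""
    return frames.mean(axis=0)

# Random stand-in data: 50 frames with 8 features each.
rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 8))
u = rng.normal(size=8)

pooled, weights = attention_pool(frames, u)
uniform_pooled, _ = attention_pool(frames, np.zeros(8))
```

With a zero scoring vector every frame receives the same weight, so the result reduces to plain mean pooling. A non-zero `u` shifts weight toward the frames that score highest, which is the sense in which such pooling can focus on local regions of the signal.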

See more on this video at https://www.microsoft.com/en-us/research/video/automatic-speech-emotion-recognition-using-recurrent-neural-networks-local-attention/
