Emotion Recognition

Students: Apply for this Summer 2026 research team by selecting the researcher's name

Raw Audio vs. Spectrograms: Advancing Emotion Recognition from Speech

June 1 - Aug 7 (proposed; dates can be flexible)

This project will investigate how machine learning models can recognize human emotions from speech. Two approaches will be examined:

  • processing raw audio signals
  • using spectrograms, which represent the audio as time-frequency images

Models will be developed and evaluated for both methods, providing hands-on experience in audio processing, image-based representation, and end-to-end learning.
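To make the two input representations concrete, the short Python sketch below loads a single speech clip and produces both forms. It is illustrative only and not part of the project code: the file name, sampling rate, and spectrogram settings are placeholder choices, and it assumes the librosa library is installed.

    import librosa
    import numpy as np

    # Approach 1: load the clip as a raw waveform (1-D array of samples).
    # "speech_clip.wav" is a placeholder path; any speech recording works.
    waveform, sr = librosa.load("speech_clip.wav", sr=16000)
    print("raw audio shape:", waveform.shape)        # (num_samples,)

    # Approach 2: convert the same signal into a log-mel spectrogram,
    # a 2-D time-frequency "image" suited to CNN-style classifiers.
    mel = librosa.feature.melspectrogram(
        y=waveform, sr=sr, n_fft=1024, hop_length=256, n_mels=128)
    mel_db = librosa.power_to_db(mel, ref=np.max)
    print("spectrogram shape:", mel_db.shape)        # (n_mels, num_frames)

In the project, the raw waveform would typically feed a 1-D convolutional or end-to-end network, while the log-mel spectrogram would feed an image-style classifier, allowing the two representations to be compared on the same data.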

The goal is to determine which approach captures emotional cues more accurately and efficiently. The results will show how the choice of data representation affects model performance and will help advance speech emotion recognition for real-world applications in education, healthcare, customer service, and human–AI interaction.

Student Learning Outcomes

Students will develop machine learning models for speech emotion recognition using raw audio and spectrograms, including data preprocessing, model training, and evaluation. The project provides hands-on experience with HPC resources, builds skills in Python, deep learning, and audio- and image-based classification, and concludes with manuscript writing and publication experience.

This work is supported by a 2026 Gonzaga Research Opportunity.