An Approach to Speech Emotion Classification Using k-NN and SVMs

SIVALINGAM Disne

Vol. 8 No. 3 (2021)

Submitted: February 17, 2024
Published: 2024-02-17

Abstract

Interaction between humans and machines has attracted growing attention in recent years. Besides facial expressions and gestures, speech has proven to be one of the most promising modalities for automatic emotion recognition. Affective computing aims to support human-computer interaction (HCI) at a psychological level, allowing computers to adjust their responses to human needs. Recognizing emotion is therefore pivotal in high-level interaction. Each emotion has distinctive properties that allow us to recognize it. The acoustic signal produced for the same expression or sentence changes mainly as a direct result of biophysical changes triggered by emotions (for example, stress-induced narrowing of the larynx). This connection between acoustic cues and emotions has made Speech Emotion Recognition one of the active topics in affective computing. The main purpose of a Speech Emotion Recognition algorithm is to determine the emotional state of a speaker from recorded speech signals. This research presents the results of applying k-NN and OVA-SVM to MFCC features, with and without a feature selection approach. First, MFCC features were extracted from the audio signal to characterize the properties of emotional speech. Second, nine basic statistical measures were calculated from the MFCCs, yielding 117-dimensional feature vectors used to train the classifiers for seven emotion classes (Anger, Happiness, Disgust, Fear, Sadness, Boredom and Neutral). Classification was then carried out in four steps. In the first step, all 117 features were classified using both classifiers. In the second step, the better classifier was identified, and the features were scaled to [-1, 1] and classified again. In the third step, whichever of the scaled or unscaled features performed better in the second step was used to classify each of the basic statistical measures separately. Finally, in the fourth step, the combination of statistical measures giving the best performance was derived using the forward feature selection method. Experiments were carried out using k-NN with different k values and a linear OVA-based SVM classifier with different parameter values. The Berlin emotional speech database for the German language was used to test the proposed methodology, and recognition rates as high as 60% were achieved for emotion recognition from the voice signal using the set of statistical measures (median, maximum, mean, inter-quartile range, skewness). OVA-SVM performs better than k-NN, and the use of the feature selection technique gives a higher recognition rate.
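
The feature construction, scaling to [-1, 1], and forward selection over statistical measures described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes the MFCC frames have already been extracted by some other tool, implements only the five measures the abstract names (the remaining four of the nine are not listed there), and uses leave-one-out k-NN accuracy as the selection criterion, which is an assumption since the abstract does not state the evaluation protocol. All function names are hypothetical. The 117 dimensions would be consistent with nine measures over 13 coefficients per frame, but the abstract does not give the number of coefficients, so the code does not fix it.

```python
from collections import Counter
import math
import statistics


def skewness(xs):
    """Population skewness; 0.0 for a constant sequence."""
    m, sd = statistics.fmean(xs), statistics.pstdev(xs)
    if sd == 0:
        return 0.0
    return sum(((x - m) / sd) ** 3 for x in xs) / len(xs)


def iqr(xs):
    """Inter-quartile range (Q3 - Q1)."""
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return q3 - q1


# Only the five measures named in the abstract (assumption: the other
# four of the nine are unknown and therefore omitted here).
MEASURES = {"median": statistics.median, "maximum": max,
            "mean": statistics.fmean, "iqr": iqr, "skewness": skewness}


def utterance_features(mfcc_frames):
    """mfcc_frames: list of frames, each a list of MFCC coefficients.
    Returns {measure name: [one value per coefficient]}."""
    tracks = list(zip(*mfcc_frames))  # one time series per coefficient
    return {name: [fn(t) for t in tracks] for name, fn in MEASURES.items()}


def scale_columns(rows):
    """Min-max scale every column to [-1, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lo, hi = [min(c) for c in cols], [max(c) for c in cols]
    return [[2 * (v - l) / (h - l) - 1 if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in rows]


def knn_loo_accuracy(rows, labels, k=1):
    """Leave-one-out accuracy of k-NN with Euclidean distance."""
    correct = 0
    for i, query in enumerate(rows):
        nearest = sorted((math.dist(query, other), labels[j])
                         for j, other in enumerate(rows) if j != i)[:k]
        vote = Counter(lab for _, lab in nearest).most_common(1)[0][0]
        correct += vote == labels[i]
    return correct / len(rows)


def forward_select(feature_dicts, labels, k=1):
    """Greedy forward selection over measures: repeatedly add the measure
    that most improves the score, and stop when nothing improves it."""
    remaining, chosen, best = list(feature_dicts[0]), [], 0.0
    while remaining:
        scored = []
        for name in remaining:
            rows = [[v for m in chosen + [name] for v in fd[m]]
                    for fd in feature_dicts]
            scored.append((knn_loo_accuracy(scale_columns(rows), labels, k),
                           name))
        score, name = max(scored)
        if score <= best:
            break
        chosen.append(name)
        remaining.remove(name)
        best = score
    return chosen, best
```

In use, `utterance_features` would be called once per recording, and the resulting list of dictionaries passed to `forward_select` together with the emotion labels. Swapping `knn_loo_accuracy` for an SVM-based scorer would follow the same pattern.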

Keywords

Mel Frequency Cepstral Coefficients (MFCC)
Fast Fourier Transformation (FFT)
Discrete Cosine Transformation (DCT)
k Nearest Neighbors (k-NN)
Support Vector Machine (SVM)
One-Vs-All (OVA)

How to Cite

Disne, S. (2024). An Approach to Speech Emotion Classification Using k-NN and SVMs. Instrumentation, 8(3). Retrieved from https://editorial.instrumentationjournal.com/index.php/instr/article/view/150

This work is licensed under a Creative Commons Attribution 4.0 International License.
