Commit 961ca22 (verified) by Vishal-Padia · 1 Parent(s): 998f276

Upload speech emotion recognition model

Files changed (1)
  1. README.md +67 -0
README.md ADDED
@@ -0,0 +1,67 @@
+ # SentimentSound
+
+ ## Overview
+ SentimentSound is a deep learning model for speech emotion recognition. It classifies short audio clips into one of eight emotional states: neutral, calm, happy, sad, angry, fearful, disgust, and surprised. The model is trained on the RAVDESS emotional speech dataset (see Acknowledgments).
+
+ ## Model Details
+ - **Model Type:** Hybrid neural network (CNN + LSTM)
+ - **Input:** Audio features extracted from 3-second WAV files
+ - **Output:** Predicted emotion class
+
+ ### Supported Emotions
+ - Neutral
+ - Calm
+ - Happy
+ - Sad
+ - Angry
+ - Fearful
+ - Disgust
+ - Surprised
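
The eight classes above match the emotion codes that RAVDESS encodes in each file name (the third of seven hyphen-separated fields). How this repository actually derives its labels is not shown here, so the following is only a minimal sketch based on the dataset's naming convention; `RAVDESS_EMOTIONS` and `emotion_from_filename` are hypothetical names used for illustration.

```python
from pathlib import Path

# Emotion codes from the RAVDESS file-naming convention
# (third hyphen-separated field of the file name).
RAVDESS_EMOTIONS = {
    "01": "neutral",
    "02": "calm",
    "03": "happy",
    "04": "sad",
    "05": "angry",
    "06": "fearful",
    "07": "disgust",
    "08": "surprised",
}


def emotion_from_filename(path: str) -> str:
    """Return the emotion label encoded in a RAVDESS file name.

    Hypothetical helper: assumes the standard seven-field name, e.g.
    03-01-06-01-02-01-12.wav (modality-channel-emotion-intensity-
    statement-repetition-actor).
    """
    fields = Path(path).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"not a RAVDESS-style file name: {path}")
    return RAVDESS_EMOTIONS[fields[2]]
```

For example, `emotion_from_filename("Actor_12/03-01-06-01-02-01-12.wav")` reads the third field, `06`, and maps it to the corresponding label.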
+
+ ## Installation
+
+ ### Clone the Repository
+ ```bash
+ git clone https://github.com/Vishal-Padia/SentimentSound.git
+ cd SentimentSound
+ ```
+
+ ### Dependencies
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ### Usage Example
+
+ ```bash
+ python emotion_predictor.py
+ ```
+
+ ## Model Performance
+ - **Accuracy:** 85%
+ - **Evaluation Metrics:** Confusion matrix (shown below)
+
+ ![Confusion matrix](confusion_matrix.png)
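
As a reading aid for the confusion matrix, overall accuracy is the sum of the diagonal divided by the sum of all entries, and per-class recall is each diagonal entry divided by its row total. The sketch below assumes rows are true labels and columns are predictions; the 3x3 matrix is made-up toy data for illustration, not this model's results.

```python
def accuracy(cm):
    """Overall accuracy: correctly classified samples / all samples."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_recall(cm):
    """Recall per class, assuming rows are true labels."""
    return [
        row[i] / sum(row) if sum(row) else 0.0
        for i, row in enumerate(cm)
    ]


# Toy counts for three classes (illustrative only, not real results).
toy_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]

print(accuracy(toy_cm))
print(per_class_recall(toy_cm))
```

Per-class recall is often more informative than a single accuracy figure when some emotions have fewer samples than others.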
+
+ ## Training Details
+ - **Feature Extraction:**
+   - MFCC (Mel-frequency cepstral coefficients)
+   - Spectral Centroid
+   - Chroma Features
+   - Spectral Contrast
+   - Zero Crossing Rate
+   - Spectral Rolloff
+ - **Augmentation:** Random noise and scaling applied
+ - **Training Techniques:**
+   - Class-weighted loss
+   - AdamW optimizer
+   - Learning rate scheduling
+   - Gradient clipping
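
The augmentation and class-weighting steps listed above could look roughly like the sketch below. The README does not give the actual parameters or code, so everything here is an assumption: "scaling" is interpreted as a random amplitude gain, the noise level and gain range are arbitrary illustrative values, and the weights use the common "balanced" heuristic `n_samples / (n_classes * class_count)`. The function names are hypothetical.

```python
from collections import Counter

import numpy as np


def augment(waveform, rng, noise_std=0.005, scale_range=(0.8, 1.2)):
    """Apply a random gain and additive Gaussian noise to a waveform.

    noise_std and scale_range are assumed values, not the ones used
    to train this model.
    """
    gain = rng.uniform(*scale_range)
    noise = rng.normal(0.0, noise_std, size=waveform.shape)
    return (gain * waveform + noise).astype(waveform.dtype)


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency in `labels`."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


rng = np.random.default_rng(0)
clip = np.zeros(48000, dtype=np.float32)  # placeholder 3-second clip at 16 kHz
noisy_clip = augment(clip, rng)
weights = balanced_class_weights(["happy", "happy", "happy", "calm"])
```

Weights like these would typically be passed to a weighted cross-entropy loss (for instance, the `weight` argument of `torch.nn.CrossEntropyLoss` if the model is trained with PyTorch) so that rarer emotions contribute more per sample.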
+
+ ## Limitations
+ - Works best with clear speech recordings
+ - Optimized for 3-second audio clips
+ - Performance may vary with different audio sources
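
Because the model expects 3-second clips, longer or shorter recordings need to be brought to that length before feature extraction. One possible approach, sketched here under the assumption that the waveform is a 1-D NumPy array, is to zero-pad the signal and split it into consecutive 3-second windows. `fixed_length_windows` is a hypothetical helper, not part of the repository.

```python
import numpy as np


def fixed_length_windows(waveform, sample_rate, clip_seconds=3.0):
    """Split a 1-D waveform into consecutive fixed-length windows.

    The final window is zero-padded; a clip shorter than one window
    yields a single padded window.
    """
    window = int(round(sample_rate * clip_seconds))
    if window <= 0:
        raise ValueError("sample_rate * clip_seconds must be positive")
    # Ceiling division, with at least one window for empty input.
    n_windows = max(1, -(-len(waveform) // window))
    padded = np.zeros(n_windows * window, dtype=waveform.dtype)
    padded[: len(waveform)] = waveform
    return padded.reshape(n_windows, window)


# A 7-second recording at 16 kHz becomes three 3-second windows.
recording = np.ones(7 * 16000, dtype=np.float32)
windows = fixed_length_windows(recording, sample_rate=16000)
```

Each row of `windows` can then be processed independently, and the per-window predictions combined (for example by majority vote) if a single label for the whole recording is needed.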
+
+ ## Acknowledgments
+ - Training dataset: [RAVDESS Emotional Speech Audio](https://www.kaggle.com/datasets/uwrfkaggler/ravdess-emotional-speech-audio)