---
library_name: peft
license: apache-2.0
base_model: openai/whisper-large-v2
tags:
- automatic-speech-recognition
- whisper
- asr
- songhoy
- hsn
- Mali
- MALIBA-AI
- lora
- fine-tuned
- code-switching
- african-language
language:
- hsn
- fr
language_bcp47:
- hsn-ML
- fr-ML
model-index:
- name: songhoy-asr-v1
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: songhoy-asr
      type: custom
      split: test
      args:
        language: hsn
    metrics:
    - name: WER
      type: wer
      value: 16.58
    - name: CER
      type: cer
      value: 4.63
pipeline_tag: automatic-speech-recognition
---

# Songhoy-ASR-v1: First Open-Source Speech Recognition Model for Songhoy

Songhoy-ASR-v1 is the **first open-source speech recognition model** for Songhoy, a language spoken by over 3 million people across Mali, Niger, and Burkina Faso. Developed as part of the MALIBA-AI initiative, it reaches a 16.58% word error rate on our test set and makes speech technology available to Songhoy speakers for the first time.

## Model Overview

The model is built for Songhoy speech recognition, with particular strengths in:

- **Pure Songhoy recognition**: Accurate transcription of traditional and contemporary Songhoy speech
- **Code-switching handling**: Effectively manages the natural mixing of Songhoy with French
- **Dialect adaptation**: Works across regional variations of Songhoy
- **Noise resilience**: Maintains accuracy even with moderate background noise

## Impressive Performance Metrics

Songhoy-ASR-v1 achieves the following results on our test dataset:

| Metric | Value | 
|--------|-------|
| Word Error Rate (WER) | 16.58% |
| Character Error Rate (CER) | 4.63% |

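To our knowledge, these are the best publicly reported results for Songhoy speech recognition. Lower is better for both metrics: a WER of 16.58% corresponds to roughly one word-level error for every six reference words, which makes the model a practical candidate for production applications.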
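
For reference, WER and CER are both edit-distance rates: the number of substitutions, insertions, and deletions needed to turn the model output into the reference transcript, divided by the reference length in words or characters. The sketch below is a minimal pure-Python illustration of that definition, not the evaluation script used for the numbers above; the function names and the English sample strings are placeholders.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (ref item missing from hyp)
                curr[j - 1] + 1,     # insertion (extra item in hyp)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Placeholder strings, not Songhoy test data.
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(f"WER: {wer(reference, hypothesis):.2%}")  # 2 edits / 6 words = 33.33%
print(f"CER: {cer(reference, hypothesis):.2%}")  # 5 edits / 22 chars = 22.73%
```

Published scores usually also depend on text normalization (casing, punctuation), which this sketch leaves out.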

## Technical Details

The model is a fine-tuned version of OpenAI's Whisper-large-v2, adapted specifically for Songhoy using LoRA (Low-Rank Adaptation). LoRA trains small low-rank adapter matrices while the base model's weights stay frozen, which keeps fine-tuning efficient and leaves the multilingual base weights intact.
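
To give a sense of why LoRA is efficient, the sketch below compares the number of weights in one full projection matrix with the number in a low-rank adapter for that matrix. The hidden size of 1280 is Whisper-large's model dimension; the rank of 32 is an assumed value for illustration only, since the adapter's actual rank and target modules are not stated in this card.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (full_matrix_params, lora_adapter_params) for one linear layer.

    LoRA expresses the update to a d_out x d_in weight as the product of
    B (d_out x rank) and A (rank x d_in); only A and B are trained.
    """
    full = d_in * d_out
    adapter = rank * (d_in + d_out)
    return full, adapter


D_MODEL = 1280     # Whisper-large hidden size
ASSUMED_RANK = 32  # hypothetical: the real rank is not given in this card

full, adapter = lora_param_counts(D_MODEL, D_MODEL, ASSUMED_RANK)
print(f"full matrix:  {full:,} weights")            # 1,638,400
print(f"LoRA adapter: {adapter:,} weights")         # 81,920
print(f"trainable fraction: {adapter / full:.1%}")  # 5.0%
```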

### Training Information
- **Base Model**: openai/whisper-large-v2
- **Fine-tuning Method**: LoRA (Parameter-Efficient Fine-Tuning)
- **Training Dataset**: [coming soon]
- **Training Duration**: 4 epochs
- **Batch Size**: 32 (8 per device with gradient accumulation steps of 4)
- **Learning Rate**: 0.001 with linear scheduler and 50 warmup steps
- **Mixed Precision**: Native AMP
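
The batch size and learning-rate settings above can be made concrete with a little arithmetic. The following sketch assumes a single training device and takes the total of 976 optimizer steps from the last row of the training results table; the function name is illustrative, and the schedule follows the usual linear-warmup, linear-decay shape rather than the exact training script.

```python
PER_DEVICE_BATCH = 8
GRAD_ACCUM_STEPS = 4
NUM_DEVICES = 1      # assumption: one device, consistent with 8 * 4 = 32
PEAK_LR = 1e-3
WARMUP_STEPS = 50
TOTAL_STEPS = 976    # assumption: final step reported in the results table

# Examples contributing to each optimizer update.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


print("effective batch size:", effective_batch)
for step in (0, 25, 50, 513, 976):
    print(f"step {step:>3}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate climbs to its peak of 0.001 at step 50 and then falls linearly to zero at the final step.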

### Training Results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.3661        | 1.0    | 245  | 0.3118          |
| 0.2712        | 2.0    | 490  | 0.2215          |
| 0.2008        | 3.0    | 735  | 0.2011          |
| 0.1518        | 3.9857 | 976  | 0.1897          |

## Real-World Applications

Songhoy-ASR-v1 enables numerous applications previously unavailable to Songhoy speakers:

- **Media Transcription**: Automatic subtitling of Songhoy content
- **Voice Interfaces**: Voice-controlled applications in Songhoy
- **Educational Tools**: Language learning and literacy applications
- **Cultural Preservation**: Documentation of oral histories and traditions
- **Healthcare Communication**: Improved access to health information
- **Accessibility Solutions**: Tools for deaf and hard-of-hearing users

## Usage Examples

```
  Coming soon
```
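
Until official examples are published, the following is a minimal sketch of how a PEFT adapter for Whisper is typically loaded and used. It assumes the LoRA weights are hosted at `MALIBA-AI/songhoy-asr-v1` (the repository named in the citation), that the processor is taken from the base model, and that the input is 16 kHz mono audio as a one-dimensional NumPy array; `load_model` and `transcribe` are illustrative helper names, not part of any released API.

```python
BASE_MODEL_ID = "openai/whisper-large-v2"
ADAPTER_ID = "MALIBA-AI/songhoy-asr-v1"  # assumption: adapter repo from the citation URL
SAMPLING_RATE = 16_000                   # Whisper expects 16 kHz audio


def load_model(device: str = "cpu"):
    """Load the Whisper base model and attach the LoRA adapter."""
    # Heavy dependencies are imported lazily so this module stays cheap to import.
    from peft import PeftModel
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(BASE_MODEL_ID)
    base = WhisperForConditionalGeneration.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model.to(device)
    model.eval()
    return processor, model


def transcribe(audio, processor, model, device: str = "cpu") -> str:
    """Transcribe one 16 kHz mono waveform (1-D float array)."""
    import torch

    # Convert the raw waveform to log-mel input features.
    features = processor.feature_extractor(
        audio, sampling_rate=SAMPLING_RATE, return_tensors="pt"
    ).input_features.to(device)
    with torch.no_grad():
        token_ids = model.generate(input_features=features)
    return processor.batch_decode(token_ids, skip_special_tokens=True)[0]
```

A typical call would be `processor, model = load_model()` followed by `transcribe(waveform, processor, model)`, where `waveform` comes from an audio loader that resamples to 16 kHz.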

## Limitations

[Coming Soon]
<!-- 
- Performance varies with different regional dialects of Songhoy
- Very specific technical terminology may have lower accuracy
- Extreme background noise can impact transcription quality
- Very young speakers or non-native speakers may have reduced accuracy
- Limited performance with extremely low-quality audio recordings -->

## Part of MALIBA-AI's African Language Initiative

Songhoy-ASR-v1 is part of MALIBA-AI's commitment to developing speech technology for all Malian languages. This model represents a significant step toward digital inclusion for Songhoy speakers and demonstrates the potential for high-quality AI systems for African languages.

Our mission of "No Malian Language Left Behind" drives us to develop technologies that:
- Preserve linguistic diversity
- Enable access to digital tools regardless of language
- Support local innovation and content creation
- Bridge the digital divide for all Malians

## Framework Versions
- PEFT 0.14.1.dev0
- Transformers 4.50.0.dev0
- PyTorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0

## License

This model is released under the Apache 2.0 license.

## Citation

```bibtex
@misc{songhoy-asr-v1,
  author = {MALIBA-AI},
  title = {Songhoy-ASR-v1: Speech Recognition for Songhoy},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/MALIBA-AI/songhoy-asr-v1}}
}
```

---

**MALIBA-AI: Empowering Mali's Future Through Community-Driven AI Innovation**

*"No Malian Language Left Behind"*