Update README.md
Browse files
README.md
CHANGED
@@ -7,4 +7,71 @@ sdk: static
|
|
7 |
pinned: false
|
8 |
---
|
9 |
|
10 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
7 |
pinned: false
|
8 |
---
|
9 |
|
10 |
+
# Kalam Technology β Arabic Speech Recognition
|
11 |
+
|
12 |
+
|
13 |
+
**Kalam Technology** is a Swedish startup pioneering Arabic speech recognition solutions. As the first company in Sweden solely dedicated to Arabic language technologies, we aim to bridge the gap in AI-driven speech applications for Arabic speakers worldwide.
|
14 |
+
|
15 |
+
## π About Us
|
16 |
+
|
17 |
+
Founded in Linkoping, Sweden, Kalam Technology specializes in developing state-of-the-art Arabic speech recognition systems. Our mission is to empower Arabic-speaking communities by providing accurate and efficient speech-to-text solutions, catering to various dialects and use cases.
|
18 |
+
|
19 |
+
## π§ Our Approach
|
20 |
+
|
21 |
+
Arabic presents unique challenges for speech recognition due to its rich morphology, diverse dialects, and the use of an abjad writing system. To address these, we employ advanced transformer-based models and deep learning techniques:
|
22 |
+
|
23 |
+
* **Transformer Models**: Utilizing architectures like Wav2Vec 2.0 and HuBERT for robust feature extraction and recognition.
|
24 |
+
* **Dialect Handling**: Training on diverse datasets to accommodate dialectal variations, including Egyptian, Levantine, Gulf, and Maghrebi Arabic.
|
25 |
+
* **Data Augmentation**: Implementing techniques such as TimeMasking and SpecAugmentation to enhance model generalization.
|
26 |
+
|
27 |
+
## π Features
|
28 |
+
|
29 |
+
* **High Accuracy**: Achieving competitive Word Error Rates (WER) on benchmarks like Common Voice Arabic.
|
30 |
+
* **Real-Time Transcription**: Providing low-latency speech-to-text conversion suitable for live applications.
|
31 |
+
* **Dialect Identification**: Automatically detecting and adapting to various Arabic dialects for improved accuracy.
|
32 |
+
* **Emotion Recognition**: Integrating emotion detection capabilities for more nuanced understanding.
|
33 |
+
|
34 |
+
## π Performance
|
35 |
+
|
36 |
+
Our models have demonstrated significant improvements in transcription accuracy, with recent implementations showing over 80% enhancement compared to baseline systems. This advancement positions our solutions ahead of many existing offerings in the market.
|
37 |
+
|
38 |
+
## π οΈ Getting Started
|
39 |
+
|
40 |
+
To utilize our Arabic speech recognition models:
|
41 |
+
|
42 |
+
1. **Installation**:
|
43 |
+
|
44 |
+
```bash
|
45 |
+
pip install transformers
|
46 |
+
```
|
47 |
+
|
48 |
+
2. **Usage**:
|
49 |
+
|
50 |
+
```python
|
51 |
+
# Load model directly
|
52 |
+
from transformers import AutoModel
|
53 |
+
model = AutoModel.from_pretrained("KalamTech/whisper-small-ar-cv-11")
|
54 |
+
```
|
55 |
+
|
56 |
+
|
57 |
+
## π Datasets
|
58 |
+
We train our models on a combination of publicly available and proprietary datasets, including:
|
59 |
+
|
60 |
+
Common Voice Arabic: A multilingual dataset with diverse Arabic speech samples.
|
61 |
+
|
62 |
+
ADI-5: Contains recordings from various Arabic dialects.
|
63 |
+
|
64 |
+
MGB-3: Features Egyptian Arabic speech from diverse sources.
|
65 |
+
|
66 |
+
π€ Collaborations
|
67 |
+
We actively seek partnerships with academic institutions and industry leaders to further research and development in Arabic speech technologies. If you're interested in collaborating, please reach out to us.
|
68 |
+
|
69 |
+
π« Contact:
|
70 |
+
|
71 |
+
Email: [email protected]
|
72 |
+
|
73 |
+
Website: https://kalam.se
|
74 |
+
|
75 |
+
|
76 |
+
*Empowering Arabic communication through cutting-edge speech recognition.*
|
77 |
+
|