---
title: README
emoji: π
colorFrom: pink
colorTo: yellow
sdk: static
pinned: false
---

# Kalam Technology: Arabic Speech Recognition

**Kalam Technology** is a Swedish startup pioneering Arabic speech recognition solutions. As the first company in Sweden solely dedicated to Arabic language technologies, we aim to bridge the gap in AI-driven speech applications for Arabic speakers worldwide.

## About Us

Founded in Linköping, Sweden, Kalam Technology specializes in developing state-of-the-art Arabic speech recognition systems. Our mission is to empower Arabic-speaking communities with accurate and efficient speech-to-text solutions that cover a range of dialects and use cases.

## Our Approach

Arabic presents unique challenges for speech recognition: rich morphology, diverse dialects, and an abjad writing system in which short vowels are usually left unwritten. To address these, we employ transformer-based models and deep learning techniques:

* **Transformer Models**: Utilizing architectures like wav2vec 2.0 and HuBERT for robust feature extraction and recognition.

* **Dialect Handling**: Training on diverse datasets to accommodate dialectal variations, including Egyptian, Levantine, Gulf, and Maghrebi Arabic.

* **Data Augmentation**: Implementing techniques such as time masking and SpecAugment to enhance model generalization.
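
To make the data augmentation point concrete, here is a minimal sketch of time masking, the idea behind SpecAugment's time mask. It is an illustration only, not our training code: the function name `time_mask`, its parameters, and the toy spectrogram are assumptions made for this example.

```python
import random


def time_mask(spectrogram, max_width, rng=None, fill=0.0):
    """Return a copy of `spectrogram` with one random span of time steps masked.

    `spectrogram` is a list of frames, each frame a list of feature values.
    `time_mask` and its parameters are hypothetical names for this sketch.
    """
    rng = rng or random.Random()
    n_frames = len(spectrogram)
    # Pick a mask width between 0 and max_width, capped at the clip length.
    width = rng.randint(0, min(max_width, n_frames))
    # Pick a start position so the whole mask fits inside the clip.
    start = rng.randint(0, n_frames - width)
    masked = [list(frame) for frame in spectrogram]  # leave the input untouched
    for t in range(start, start + width):
        masked[t] = [fill] * len(masked[t])
    return masked


# Toy input: 10 frames with 4 feature bins each (made-up values).
spec = [[1.0] * 4 for _ in range(10)]
augmented = time_mask(spec, max_width=3, rng=random.Random(0))
```

Masking a contiguous block of frames forces the model to rely on surrounding context instead of any single stretch of audio. In practice a library transform over tensors would replace this list-based version.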

## Features

* **High Accuracy**: Achieving competitive Word Error Rates (WER) on benchmarks like Common Voice Arabic.

* **Real-Time Transcription**: Providing low-latency speech-to-text conversion suitable for live applications.

* **Dialect Identification**: Automatically detecting and adapting to various Arabic dialects for improved accuracy.

* **Emotion Recognition**: Integrating emotion detection capabilities for more nuanced understanding.
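
For readers unfamiliar with the WER metric mentioned above, the sketch below computes it as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name and the sample sentences are illustrative assumptions. Real evaluations usually also normalize the text first, for example by stripping diacritics and punctuation, which this sketch does not do.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of four reference words.
wer = word_error_rate("السلام عليكم ورحمة الله", "السلام عليكم رحمة الله")
```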

## Performance

Our models have demonstrated significant gains in transcription accuracy, with recent models showing an improvement of over 80% relative to baseline systems. This advancement positions our solutions ahead of many existing offerings in the market.
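
The text above does not say which metric the improvement figure refers to. Assuming it is a relative reduction in word error rate, the calculation would look like the sketch below. The function name and the example numbers are hypothetical and are not measured results.

```python
def relative_wer_reduction(baseline_wer, model_wer):
    """Fraction of the baseline's word errors removed by the new model."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - model_wer) / baseline_wer


# Hypothetical numbers for illustration only: a baseline at 50% WER and a
# model at 10% WER correspond to a relative reduction of about 80%.
reduction = relative_wer_reduction(0.50, 0.10)
```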

## Getting Started

To use our Arabic speech recognition models:

1. **Installation**:

```bash
pip install transformers torch
```
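
2. **Usage**:

```python
# Load the processor and the model with its speech-to-text head
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq

processor = AutoProcessor.from_pretrained("KalamTech/whisper-small-ar-cv-11")
model = AutoModelForSpeechSeq2Seq.from_pretrained("KalamTech/whisper-small-ar-cv-11")
```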
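
Loading the model is only the first step. A minimal sketch of transcribing a file with the Transformers `pipeline` API might look like the following. It assumes the repository ships the tokenizer and feature-extractor files the pipeline needs and that `ffmpeg` is installed for decoding audio from disk. The helper name `transcribe` and the path `sample.wav` are placeholders.

```python
def transcribe(audio_path, model_id="KalamTech/whisper-small-ar-cv-11"):
    """Transcribe one audio file and return the recognized text.

    `transcribe` is a hypothetical helper for this sketch, not part of
    any published package.
    """
    # Imported here so that defining the helper does not load Transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)  # reading a file path requires ffmpeg
    return result["text"]
```

Calling `transcribe("sample.wav")` downloads the model on first use and returns the transcript as a string. For many files, build the pipeline once and reuse it instead of calling this helper in a loop.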

## Datasets

We train our models on a combination of publicly available and proprietary datasets, including:

* **Common Voice Arabic**: The Arabic portion of the multilingual Common Voice dataset, with diverse speech samples.

* **ADI-5**: Contains recordings from various Arabic dialects.

* **MGB-3**: Features Egyptian Arabic speech from diverse sources.

## Collaborations

We actively seek partnerships with academic institutions and industry leaders to further research and development in Arabic speech technologies. If you're interested in collaborating, please reach out to us.

## Contact

* Email: [email protected]

* Website: https://kalam.se

*Empowering Arabic communication through cutting-edge speech recognition.*