AI & ML interests
Speech Recognition, LLMs
Kalam Technology - Arabic Speech Recognition
Kalam Technology is a Swedish startup pioneering Arabic speech recognition solutions. As the first company in Sweden solely dedicated to Arabic language technologies, we aim to bridge the gap in AI-driven speech applications for Arabic speakers worldwide.
About Us
Founded in Linköping, Sweden, Kalam Technology specializes in developing state-of-the-art Arabic speech recognition systems. Our mission is to empower Arabic-speaking communities by providing accurate and efficient speech-to-text solutions that cater to various dialects and use cases.
Our Approach
Arabic presents unique challenges for speech recognition due to its rich morphology, diverse dialects, and the use of an abjad writing system. To address these, we employ advanced transformer-based models and deep learning techniques:
- Transformer Models: Utilizing architectures like Wav2Vec 2.0 and HuBERT for robust feature extraction and recognition.
- Dialect Handling: Training on diverse datasets to accommodate dialectal variations, including Egyptian, Levantine, Gulf, and Maghrebi Arabic.
- Data Augmentation: Applying SpecAugment-style techniques such as time masking to improve model generalization.
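To make the data augmentation point concrete, here is a minimal sketch of SpecAugment-style time masking with NumPy. It is an illustration only, not Kalam Technology's training code: the function name `time_mask` and the parameters `max_width` and `num_masks` are assumptions chosen for this example.

```python
import numpy as np


def time_mask(spec, max_width=20, num_masks=2, rng=None):
    """Zero out random spans of time frames in a (freq_bins, time_frames) spectrogram.

    Hypothetical helper for illustration; parameter names and defaults are assumptions.
    """
    if rng is None:
        rng = np.random.default_rng()
    out = spec.copy()  # leave the caller's array untouched
    n_frames = out.shape[1]
    for _ in range(num_masks):
        # Pick a mask width in [0, max_width], capped at the clip length.
        width = min(int(rng.integers(0, max_width + 1)), n_frames)
        # Pick a start so the mask fits entirely inside the spectrogram.
        start = int(rng.integers(0, n_frames - width + 1))
        out[:, start:start + width] = 0.0
    return out


# Example: an 80-bin, 100-frame dummy spectrogram.
spec = np.ones((80, 100), dtype=np.float32)
augmented = time_mask(spec, rng=np.random.default_rng(0))
```

Frequency masking works the same way along the first axis. Masks are usually applied on the fly during training, so each epoch sees a different corruption of the same utterance.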
Features
- High Accuracy: Achieving competitive Word Error Rates (WER) on benchmarks like Common Voice Arabic.
- Real-Time Transcription: Providing low-latency speech-to-text conversion suitable for live applications.
- Dialect Identification: Automatically detecting and adapting to various Arabic dialects for improved accuracy.
- Emotion Recognition: Integrating emotion detection capabilities for more nuanced understanding.
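For readers unfamiliar with the metric named above, Word Error Rate is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch in plain Python; the function name `word_error_rate` is an assumption for this example, and it splits on whitespace only, without the text normalization (diacritics, punctuation) that a real Arabic evaluation would likely apply.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


# One word dropped from a four-word reference gives a WER of 0.25.
wer = word_error_rate("مرحبا بكم في السويد", "مرحبا بكم السويد")
```

Lower is better, and WER can exceed 1.0 when the hypothesis contains many inserted words.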
Performance
Our models have shown significant improvements in transcription accuracy, with recent implementations improving on baseline systems by over 80%. This puts our solutions ahead of many existing offerings on the market.
Getting Started
To utilize our Arabic speech recognition models:
Installation:
pip install transformers
Usage:
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("KalamTech/whisper-small-ar-cv-11")
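Loading the model this way gives you the network itself, not a transcript. One possible way to go from an audio file to text is the `transformers` speech recognition pipeline, sketched below. The helper name `transcribe` and the file name in the comment are hypothetical, and the sketch assumes a backend such as PyTorch is installed and that `ffmpeg` is available for decoding audio files.

```python
MODEL_ID = "KalamTech/whisper-small-ar-cv-11"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the recognized text.

    Hypothetical helper for illustration; not part of any published Kalam API.
    """
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    # Downloads the model from the Hugging Face Hub on first use.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]


# Usage (assumes a local recording named sample.wav exists):
# text = transcribe("sample.wav")
# print(text)
```

Building the pipeline once and reusing it across files avoids reloading the model for every call.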
Datasets
We train our models on a combination of publicly available and proprietary datasets, including:
- Common Voice Arabic: The Arabic portion of the multilingual Common Voice dataset, with diverse Arabic speech samples.
- ADI-5: Contains recordings from various Arabic dialects.
- MGB-3: Features Egyptian Arabic speech from diverse sources.
Collaborations
We actively seek partnerships with academic institutions and industry leaders to further research and development in Arabic speech technologies. If you're interested in collaborating, please reach out to us.
Contact:
Email: [email protected]
Website: https://kalam.se
Empowering Arabic communication through cutting-edge speech recognition.