text2gloss_ar
This model (sabbas/Text2Gloss_ar) is a fine-tuned version of Helsinki-NLP/opus-mt-en-ar on a custom dataset that is not named in the model metadata (see Training and evaluation data). It achieves the following results on the evaluation set:
- Loss: 0.0306
- Word BLEU: 97.0831
- Char BLEU: 98.9391
Model description
- Source: Text (spoken text)
- Target: gloss (ArSL gloss)
- Domain: translation of Friday sermons from text to ArSL gloss. We fine-tuned a pre-trained model (opus-mt) to specialize it for this domain.
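A minimal inference sketch follows, assuming the checkpoint is published on the Hugging Face Hub as `sabbas/Text2Gloss_ar` and loads with the standard `transformers` translation pipeline, as Marian (opus-mt) checkpoints normally do. The helper name `text_to_gloss` and the `max_length=128` default are illustrative choices, not settings taken from this card.

```python
# Assumption: the fine-tuned checkpoint is available on the Hub under this id.
MODEL_ID = "sabbas/Text2Gloss_ar"


def text_to_gloss(sentences, model_id=MODEL_ID, max_length=128):
    """Translate a list of sentences into ArSL gloss strings.

    Hypothetical helper: max_length=128 is an illustrative default,
    not a value documented for this model.
    """
    # Imported lazily so this module can be imported without transformers.
    from transformers import pipeline

    translator = pipeline("translation", model=model_id)
    outputs = translator(list(sentences), max_length=max_length)
    # The translation pipeline returns one dict per input sentence.
    return [item["translation_text"] for item in outputs]


# Usage (downloads the model on first call):
#   glosses = text_to_gloss(["<one sentence from a Friday sermon>"])
```

The function only wraps the pipeline call; batching, device placement, and generation settings are left at library defaults.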
Intended uses & limitations
Data Specificity: The model is trained specifically on Arabic text and ArSL glosses. It may not perform well when applied to other languages or sign languages.
Contextual Accuracy: While the model handles straightforward translations effectively, it might struggle with complex sentences or phrases that require a deep understanding of context, especially when combining or shuffling sentences.
Generalization to Unseen Data: The model’s performance may degrade when exposed to text that significantly differs in style or content from the training data, such as highly specialized jargon or informal language.
Gloss Representation: The model translates text into glosses, which are a written representation of sign language but do not capture the full complexity of sign language grammar and non-manual signals (facial expressions, body language).
Test Dataset Limitations: The test dataset is a shortened version of a sermon and does not cover all possible sentence structures and contexts, so the test scores may not reflect how well the model generalizes to other domains.
Ethical Considerations: Care must be taken when deploying this model in real-world applications, as misinterpretations or inaccuracies in translation can lead to misunderstandings, especially in sensitive communications.
Training and evaluation data
- Dataset size before augmentation: 131
- Dataset size after augmentation: 8646
- (For training and validation): The augmented dataset was split into:
- train: 7349
- validation: 1297
- (For testing): We used a separate dataset of phrases from an actual Friday sermon scenario, assembled into a short Friday sermon.
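The train and validation counts above are consistent with an 85/15 split of the 8646 augmented examples. The sketch below reproduces those sizes; the 0.15 fraction is inferred from the counts, and reusing seed 42 for shuffling is an assumption, since the card does not describe how the split was made.

```python
import random


def split_dataset(examples, validation_fraction=0.15, seed=42):
    """Shuffle and split examples into (train, validation).

    validation_fraction=0.15 is inferred from the reported sizes;
    the shuffle and seed are assumptions, not the authors' procedure.
    """
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_val = round(len(items) * validation_fraction)
    return items[n_val:], items[:n_val]


# Stand-in for the 8646 augmented (text, gloss) pairs.
augmented = list(range(8646))
train, validation = split_dataset(augmented)
print(len(train), len(validation))  # 7349 1297
```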
Training procedure
1- Train and Evaluation Results:
- Train and Evaluation Loss: 0.464023
- Train and Evaluation Word BLEU Score: 97.08
- Train and Evaluation Char BLEU Score: 98.94
- Train and Evaluation Runtime (seconds): 562.8277
- Train and Evaluation Samples per Second: 391.718
- Train and Evaluation Steps per Second: 12.26
2- Test Results:
- Test Loss: 0.289312
- Test Word BLEU Score: 76.92
- Test Char BLEU Score: 86.30
- Test Runtime (seconds): 1.1038
- Test Samples per Second: 41.67
- Test Steps per Second: 0.91
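Character-level BLEU is consistently higher than word-level BLEU in these results. The sketch below illustrates why, using an unsmoothed single-reference BLEU written from scratch. It is not the metric implementation used for this card, which is not specified, and the gloss strings are invented English-letter placeholders rather than real ArSL glosses.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(reference, hypothesis, max_n=4):
    """Unsmoothed single-reference BLEU on token lists, scaled to 0-100."""
    if not hypothesis:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hypothesis, n)
        ref_counts = ngrams(reference, n)
        # Clipped matches: a hypothesis n-gram counts at most as often
        # as it occurs in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    ref_len, hyp_len = len(reference), len(hypothesis)
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity * math.exp(sum(log_precisions) / max_n)


# Invented placeholder glosses; only the last word differs.
reference = "TODAY IMAM SPEAK ABOUT PATIENCE AND DAILY PRAYER"
hypothesis = "TODAY IMAM SPEAK ABOUT PATIENCE AND DAILY PRAYERS"

word_bleu = bleu(reference.split(), hypothesis.split())
char_bleu = bleu(list(reference), list(hypothesis))
print(f"word BLEU: {word_bleu:.2f}, char BLEU: {char_bleu:.2f}")
```

One wrong word breaks every word n-gram that contains it, while most of its character n-grams still match, so the character-level score is penalized far less.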
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
- mixed_precision_training: Native AMP
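To make the linear scheduler concrete, the sketch below derives the step counts from the reported training-set size and batch size, then evaluates the learning rate at a few steps. It assumes a single device, no gradient accumulation, and zero warmup steps; none of these is stated in the card.

```python
import math

TRAIN_EXAMPLES = 7349   # training split size reported in this card
BATCH_SIZE = 32         # train_batch_size
NUM_EPOCHS = 30         # num_epochs
BASE_LR = 2e-5          # learning_rate

# Assumes one device and no gradient accumulation.
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total=total_steps, warmup=0):
    """Linear warmup (if any) followed by linear decay to zero.

    warmup=0 is an assumption; the card lists no warmup setting.
    """
    if warmup and step < warmup:
        return base_lr * step / warmup
    return base_lr * max(0, total - step) / max(1, total - warmup)


print(steps_per_epoch, total_steps)  # 230 6900
for step in (0, 2070, 6900):
    print(step, linear_lr(step))
```

The derived 230 steps per epoch matches the Step column of the training results table, and 6900 total steps agrees with the reported runtime multiplied by steps per second (562.8277 × 12.26 ≈ 6900).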
Training results
Training Loss | Epoch | Step | Validation Loss | Word BLEU | Char BLEU |
---|---|---|---|---|---|
2.726 | 1.0 | 230 | 0.8206 | 24.8561 | 42.0470 |
0.6983 | 2.0 | 460 | 0.3166 | 61.8643 | 74.7375 |
0.3167 | 3.0 | 690 | 0.1288 | 85.4787 | 92.1539 |
0.1599 | 4.0 | 920 | 0.0699 | 92.9287 | 97.2020 |
0.0971 | 5.0 | 1150 | 0.0504 | 94.6364 | 97.6967 |
0.0626 | 6.0 | 1380 | 0.0383 | 96.3441 | 98.6000 |
0.0507 | 7.0 | 1610 | 0.0396 | 95.9440 | 98.5028 |
0.036 | 8.0 | 1840 | 0.0364 | 96.0036 | 98.3957 |
0.0289 | 9.0 | 2070 | 0.0306 | 97.0831 | 98.9391 |
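The headline evaluation numbers correspond to the best row of this table. As a small illustration, the sketch below copies the listed rows and selects the best epoch by validation loss and by word BLEU; it only covers the epochs shown here and makes no claim about how the checkpoint was actually selected.

```python
# (epoch, validation_loss, word_bleu, char_bleu), copied from the table.
rows = [
    (1, 0.8206, 24.8561, 42.0470),
    (2, 0.3166, 61.8643, 74.7375),
    (3, 0.1288, 85.4787, 92.1539),
    (4, 0.0699, 92.9287, 97.2020),
    (5, 0.0504, 94.6364, 97.6967),
    (6, 0.0383, 96.3441, 98.6000),
    (7, 0.0396, 95.9440, 98.5028),
    (8, 0.0364, 96.0036, 98.3957),
    (9, 0.0306, 97.0831, 98.9391),
]

best_by_loss = min(rows, key=lambda r: r[1])
best_by_word_bleu = max(rows, key=lambda r: r[2])

print(best_by_loss[0], best_by_word_bleu[0])  # 9 9

# Validation loss is not strictly decreasing: list epochs where it rose.
regressions = [cur[0] for prev, cur in zip(rows, rows[1:]) if cur[1] > prev[1]]
print(regressions)  # [7]
```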
Framework versions
- Transformers 4.42.4
- Pytorch 1.12.0+cu102
- Datasets 2.21.0
- Tokenizers 0.19.1