---
license: mit
language:
- en
- hi
- el
metrics:
- bleu
base_model:
- facebook/m2m100_418M
library_name: adapter-transformers
pipeline_tag: text2text-generation
---
# Model Card for aktheroy/FT_Translate_en_el_hi
This model is a fine-tuned version of `facebook/m2m100_418M`, designed for multilingual translation tasks between English (en), Greek (el), and Hindi (hi). The model achieves efficient translation by leveraging the M2M100 architecture, which supports many-to-many language translation.
## Model Details
### Model Description
- **Developed by:** Aktheroy
- **Model type:** Transformer-based encoder-decoder
- **Language(s) (NLP):** English, Hindi, Greek
- **License:** MIT
- **Finetuned from model:** facebook/m2m100_418M
### Model Sources
- **Repository:** [aktheroy/FT_Translate_en_el_hi](https://huggingface.co/aktheroy/FT_Translate_en_el_hi)
## Uses
### Direct Use
The model can be used for translation tasks between the supported languages (English, Hindi, Greek). Use cases include:
- Cross-lingual communication
- Multilingual content generation
- Language learning assistance
### Downstream Use
The model can be fine-tuned further for domain-specific translation tasks, such as medical or legal translations.
### Out-of-Scope Use
The model is not suitable for:
- Translating unsupported languages
- Generating content for sensitive or harmful purposes
## Bias, Risks, and Limitations
While the model supports multilingual translations, it might exhibit:
- Biases from the pretraining and fine-tuning datasets.
- Reduced performance for idiomatic expressions or cultural nuances.
### Recommendations
Users should:
- Verify translations, especially for critical applications.
- Use supplementary tools to validate outputs in sensitive scenarios (one possible check is sketched below).
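As a minimal sketch of such a supplementary check, the snippet below performs a round-trip (back-translation) comparison with the model itself. The example sentence and the `translate` helper are illustrative, not part of this repository.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "aktheroy/FT_Translate_en_el_hi"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

def translate(text, src, tgt):
    # Set the source language, then force the decoder to start with the
    # target-language token so M2M100 generates in the requested language.
    tokenizer.src_lang = src
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, forced_bos_token_id=tokenizer.get_lang_id(tgt))
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]

# Hypothetical sensitive sentence: translate en -> hi, then back hi -> en.
original = "The patient should take the medication twice a day."
hindi = translate(original, "en", "hi")
round_trip = translate(hindi, "hi", "en")
print(original)
print(round_trip)  # a large divergence suggests the translation needs human review
```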
## How to Get Started with the Model
Here is an example of how to use the model for translation tasks:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "aktheroy/FT_Translate_en_el_hi"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example input: English source, Hindi target
input_text = "Hello, how are you?"
tokenizer.src_lang = "en"

# Tokenize, then force the decoder to start with the Hindi language token
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.get_lang_id("hi"),
)
translation = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(translation)
```
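To translate between the other supported pairs (for example into Greek, or back into English), set `tokenizer.src_lang` to the source language code and pass the matching `tokenizer.get_lang_id(...)` value as `forced_bos_token_id` to `generate`.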
## Training Details
### Training Data
The model was fine-tuned on a custom dataset containing parallel translations between English, Hindi, and Greek.
### Training Procedure
#### Preprocessing
The dataset was preprocessed to:
- Normalize text.
- Tokenize using the M2M100 tokenizer.
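A minimal preprocessing sketch for one direction (en → hi) is shown below. The column names `source` and `target`, the whitespace normalization, and `max_length` are assumptions for illustration, not details taken from the actual training script.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/m2m100_418M")

def preprocess(examples, src_lang="en", tgt_lang="hi", max_length=128):
    # Assumed normalization: collapse whitespace in source and target text
    tokenizer.src_lang = src_lang
    tokenizer.tgt_lang = tgt_lang
    sources = [" ".join(s.split()) for s in examples["source"]]
    targets = [" ".join(t.split()) for t in examples["target"]]
    # text_target tokenizes the references with the target-language settings
    return tokenizer(sources, text_target=targets, max_length=max_length, truncation=True)
```

With a `datasets.Dataset` of parallel sentences, a function like this would typically be applied via `.map(preprocess, batched=True)` before training.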
#### Training Hyperparameters
- **Epochs:** 10
- **Batch size:** 16
- **Learning rate:** 5e-5
- **Mixed Precision:** Disabled (FP32 used)
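A minimal sketch of how these hyperparameters might be expressed with Hugging Face `Seq2SeqTrainingArguments` follows; the `output_dir` is a placeholder, and the actual training script may differ.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ft_translate_en_el_hi",  # placeholder output directory
    num_train_epochs=10,
    per_device_train_batch_size=16,
    learning_rate=5e-5,
    fp16=False,  # mixed precision disabled; training runs in FP32
)
```

These arguments would typically be passed to a `Seq2SeqTrainer` together with the model, the tokenized datasets, and a `DataCollatorForSeq2Seq`.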
#### Speeds, Sizes, Times
- **Training runtime:** 20.3 hours
- **Training samples per second:** 17.508
- **Training steps per second:** 0.137
- **Final training loss:** 0.873
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
The model was evaluated on a held-out test set from the same domains as the training data.
#### Metrics
- BLEU score (to be computed during final evaluation).
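As a hedged sketch, BLEU could be computed with the `evaluate` wrapper around sacrebleu; the prediction and reference sentences below are illustrative only.

```python
import evaluate

bleu = evaluate.load("sacrebleu")

# Illustrative predictions/references; in practice these would come from
# model outputs on the held-out test set and its gold translations.
predictions = ["नमस्ते, आप कैसे हैं?"]
references = [["नमस्ते, आप कैसे हैं?"]]

result = bleu.compute(predictions=predictions, references=references)
print(result["score"])
```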
### Results
- **Training Loss:** 0.873
- Detailed BLEU score results will be provided in subsequent updates.
## Environmental Impact
- **Hardware Type:** MacBook with M3 Pro
- **Hours used:** 20.3 hours
- **Cloud Provider:** None (trained on local hardware)
- **Carbon Emitted:** Minimal (local training)
## Technical Specifications
### Model Architecture and Objective
The model is based on the M2M100 architecture, a transformer-based encoder-decoder model designed for multilingual translation without relying on English as an intermediary language.
### Compute Infrastructure
#### Hardware
- **Device:** MacBook with M3 Pro
#### Software
- Transformers library from Hugging Face
- Python 3.12
## Citation
If you use this model, please cite it as:
**APA:**
Aktheroy (2025). Fine-Tuned M2M100 Translation Model. Hugging Face. Retrieved from [https://huggingface.co/aktheroy/FT_Translate_en_el_hi](https://huggingface.co/aktheroy/FT_Translate_en_el_hi)
## Model Card Authors
- Aktheroy
## Model Card Contact
For questions or feedback, contact the author via Hugging Face.