---
license: mit
language:
- en
- hi
- el
metrics:
- bleu
base_model:
- facebook/m2m100_418M
library_name: adapter-transformers
pipeline_tag: text2text-generation
---
# Model Card for aktheroy/FT_Translate_en_el_hi

This model is a fine-tuned version of `facebook/m2m100_418M` for translation between English (en), Greek (el), and Hindi (hi). It builds on the M2M100 architecture, which supports many-to-many translation directly between its covered languages.

## Model Details

### Model Description

- **Developed by:** Aktheroy
- **Model type:** Transformer-based encoder-decoder
- **Language(s) (NLP):** English, Hindi, Greek
- **License:** MIT
- **Finetuned from model:** facebook/m2m100_418M

### Model Sources

- **Repository:** [aktheroy/FT_Translate_en_el_hi](https://huggingface.co/aktheroy/FT_Translate_en_el_hi)

## Uses

### Direct Use

The model can be used for translation tasks between the supported languages (English, Hindi, Greek). Use cases include:
- Cross-lingual communication
- Multilingual content generation
- Language learning assistance

### Downstream Use

The model can be fine-tuned further for domain-specific translation tasks, such as medical or legal translations.

### Out-of-Scope Use

The model is not suitable for:
- Translating unsupported languages
- Generating content for sensitive or harmful purposes

## Bias, Risks, and Limitations

While the model supports multilingual translations, it might exhibit:
- Biases from the pretraining and fine-tuning datasets.
- Reduced performance for idiomatic expressions or cultural nuances.

### Recommendations

Users should:
- Verify translations, especially for critical applications.
- Use supplementary tools to validate outputs in sensitive scenarios.

## How to Get Started with the Model

Here is an example of how to use the model for translation tasks:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "aktheroy/FT_Translate_en_el_hi"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example input: translate English to Hindi
input_text = "Hello, how are you?"
tokenizer.src_lang = "en"

# Tokenize and generate; M2M100 selects the target language by forcing
# its language token as the first generated token
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, forced_bos_token_id=tokenizer.get_lang_id("hi"))
translation = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(translation)
```

## Training Details

### Training Data

The model was fine-tuned on a custom dataset containing parallel translations between English, Hindi, and Greek.

### Training Procedure

#### Preprocessing

The dataset was preprocessed to:
- Normalize text.
- Tokenize source and target sentences with the M2M100 tokenizer (a minimal sketch follows below).
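
The fine-tuning corpus itself is not published with this card, so the snippet below is only a sketch of the tokenization step, using a placeholder sentence pair and an assumed maximum length of 128 tokens:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/m2m100_418M")
tokenizer.src_lang = "en"  # language code of the source sentence
tokenizer.tgt_lang = "hi"  # language code of the reference translation

# Placeholder pair; the real corpus is a parallel en/el/hi dataset
example = {"src": "Hello, how are you?", "tgt": "नमस्ते, आप कैसे हैं?"}

# text_target tokenizes the reference with the target-language code,
# producing the "labels" field used for sequence-to-sequence training
model_inputs = tokenizer(
    example["src"],
    text_target=example["tgt"],
    truncation=True,
    max_length=128,
)
print(model_inputs.keys())  # input_ids, attention_mask, labels
```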

#### Training Hyperparameters

- **Epochs:** 10
- **Batch size:** 16
- **Learning rate:** 5e-5
- **Mixed Precision:** Disabled (FP32 used)
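
The original training script is not included in this repository. As a rough, non-authoritative sketch, the hyperparameters above could be expressed with the Hugging Face `Seq2SeqTrainer`; `tokenized_train` stands in for the preprocessed parallel corpus described above:

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base_model = "facebook/m2m100_418M"
model = AutoModelForSeq2SeqLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Hyperparameters from the list above; mixed precision disabled (FP32)
args = Seq2SeqTrainingArguments(
    output_dir="FT_Translate_en_el_hi",
    num_train_epochs=10,
    per_device_train_batch_size=16,
    learning_rate=5e-5,
    fp16=False,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,  # placeholder: the tokenized parallel corpus
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)
trainer.train()
```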

#### Speeds, Sizes, Times

- **Training runtime:** 20.3 hours
- **Training samples per second:** 17.508
- **Training steps per second:** 0.137
- **Final training loss:** 0.873

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated on a held-out test set from the same domains as the training data.

#### Metrics

- BLEU score (not yet computed; see the evaluation sketch below).
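
As a minimal sketch, BLEU can be computed with the `evaluate` library's SacreBLEU metric; the sentences below are placeholders, not items from the actual held-out test set:

```python
import evaluate

bleu = evaluate.load("sacrebleu")

# Placeholder prediction/reference pair (Hindi), for illustration only
predictions = ["नमस्ते, आप कैसे हैं?"]
references = [["नमस्ते, आप कैसे हैं?"]]

result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 2))
```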

### Results

- **Training Loss:** 0.873
- Detailed BLEU score results will be provided in subsequent updates.

## Environmental Impact

- **Hardware Type:** MacBook with M3 Pro
- **Hours used:** 20.3 hours
- **Cloud Provider:** None (trained on local hardware)
- **Carbon Emitted:** Minimal (local training)

## Technical Specifications

### Model Architecture and Objective

The model is based on the M2M100 architecture, a transformer-based encoder-decoder model designed for multilingual translation without relying on English as an intermediary language.
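
To confirm the architecture of the published checkpoint, the configuration can be inspected directly (attribute names follow the standard `M2M100Config` in the Transformers library):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("aktheroy/FT_Translate_en_el_hi")
print(config.model_type)       # expected: "m2m_100"
print(config.encoder_layers)   # encoder depth
print(config.decoder_layers)   # decoder depth
print(config.d_model)          # hidden size
```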

### Compute Infrastructure

#### Hardware

- **Device:** MacBook with M3 Pro

#### Software

- Transformers library from Hugging Face
- Python 3.12

## Citation

If you use this model, please cite it as:

**APA:**
Aktheroy (2025). Fine-Tuned M2M100 Translation Model. Hugging Face. Retrieved from [https://huggingface.co/aktheroy/FT_Translate_en_el_hi](https://huggingface.co/aktheroy/FT_Translate_en_el_hi)

## Model Card Authors

- Aktheroy

## Model Card Contact

For questions or feedback, contact the author via Hugging Face.