	---
	library_name: transformers
	license: cc-by-nc-4.0
	base_model: facebook/nllb-200-3.3B
	metrics:
	- bleu
	- chrf
	- ter
	model-index:
	- name: Terjman-Supreme-v2.0
	  results: []
	datasets:
	- BounharAbdelaziz/Terjman-v2-English-Darija-Dataset-350K
	language:
	- ary
	- en
	pipeline_tag: translation
	---

	# 🇲🇦 Terjman-Supreme-v2.0 (3.3B) 🚀

	Terjman-Supreme-v2.0 is an improved version of [atlasia/Terjman-Ultra-v1](https://huggingface.co/atlasia/Terjman-Ultra-v1), built on the Transformer architecture and fine-tuned for high-quality, accurate translation.

	This version is still based on [facebook/nllb-200-3.3B](https://huggingface.co/facebook/nllb-200-3.3B) but has been trained on a larger and more refined dataset, leading to improved translation performance. The model achieves results on par with gpt-4o-2024-08-06 on [TerjamaBench](https://huggingface.co/datasets/atlasia/TerjamaBench), an evaluation benchmark for English-to-Moroccan Darija translation models that places particular emphasis on cultural aspects.


	## 🚀 Features

	- ✅ Fine-tuned for English → Moroccan Darija translation.
	- ✅ State-of-the-art performance among open-source models.
	- ✅ Compatible with 🤗 Transformers and easily deployable on various hardware setups.


	## 🔥 Performance Comparison

	The following table compares Terjman-Supreme-v2.0 against proprietary and open-source models using BLEU, chrF, and TER scores. Higher BLEU/chrF and lower TER indicate better translation quality.

	| Model | Size | BLEU↑ | chrF↑ | TER↓ |
	|-------|------|-------|-------|------|
	| Proprietary Models | | | | |
	| gemini-exp-1206 | * | 30.69 | 54.16 | 67.62 |
	| claude-3-5-sonnet-20241022 | * | 30.51 | 51.80 | 67.42 |
	| gpt-4o-2024-08-06 | * | 28.30 | 50.13 | 71.77 |
	| Open-Source Models | | | | |
	| Terjman-Ultra-v2.0 | 1.3B | 25.00 | 44.70 | 77.20 |
	| Terjman-Supreme-v2.0 (This model) | 3.3B | 23.43 | 44.57 | 78.17 |
	| Terjman-Large-v2.0 | 240M | 22.67 | 42.57 | 83.00 |
	| Terjman-Nano-v2.0 | 77M | 18.84 | 38.41 | 94.73 |
	| atlasia/Terjman-Large-v1.2 | 240M | 16.33 | 37.10 | 89.13 |
	| MBZUAI-Paris/Atlas-Chat-9B | 9B | 14.80 | 35.26 | 93.95 |
	| facebook/nllb-200-3.3B | 3.3B | 14.76 | 34.17 | 94.33 |
	| atlasia/Terjman-Nano | 77M | 9.98 | 26.55 | 106.49 |
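
To make the effect of fine-tuning easier to read off the table, the short sketch below computes the score differences between this model and its base model, `facebook/nllb-200-3.3B`. The numbers are copied from the table above; the `scores` dictionary and the `deltas` helper are illustrative names, not part of any released tooling.

```python
# (BLEU, chrF, TER) scores copied from the comparison table above.
scores = {
    "Terjman-Supreme-v2.0": (23.43, 44.57, 78.17),
    "facebook/nllb-200-3.3B": (14.76, 34.17, 94.33),
}


def deltas(model, baseline):
    """Return (BLEU, chrF, TER) differences of `model` relative to `baseline`."""
    return tuple(round(m - b, 2) for m, b in zip(scores[model], scores[baseline]))


bleu_gain, chrf_gain, ter_change = deltas("Terjman-Supreme-v2.0", "facebook/nllb-200-3.3B")

# Positive BLEU/chrF deltas and a negative TER delta all indicate improvement.
print(f"BLEU {bleu_gain:+.2f}, chrF {chrf_gain:+.2f}, TER {ter_change:+.2f}")
# prints: BLEU +8.67, chrF +10.40, TER -16.16
```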


	## 🔬 Model Details

	- Base Model: [facebook/nllb-200-3.3B](https://huggingface.co/facebook/nllb-200-3.3B)
	- Architecture: Transformer-based sequence-to-sequence model
	- Training Data: High-quality English-Darija parallel corpus (`BounharAbdelaziz/Terjman-v2-English-Darija-Dataset-350K`)
	- Training Precision: Mixed FP16

	## 🚀 How to Use

	You can use the model with the Hugging Face Transformers library:

	```python
	from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

	model_name = "BounharAbdelaziz/Terjman-Supreme-v2.0"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

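	def translate(text, src_lang="eng_Latn", tgt_lang="ary_Arab"):
	    # Tokenize the source text using NLLB language codes (English -> Moroccan Darija by default)
	    inputs = tokenizer(text, return_tensors="pt", src_lang=src_lang, tgt_lang=tgt_lang)
	    # Generate the translation token IDs
	    output = model.generate(**inputs)
	    # Decode the single generated sequence, dropping special tokens
	    return tokenizer.decode(output[0], skip_special_tokens=True)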

	# Example translation
	text = "Hello there! Today the weather is so nice in Geneva, couldn't ask for more to enjoy the holidays :)"
	translation = translate(text)
	print("Translation:", translation)
	# prints: صباح الخير! اليوم الطقس زوين بزاف فجنيف، ما قدرتش نطلب أكثر باش نتمتع بالعطلة :)
	```


	## 🖥️ Deployment

	### Run in a Hugging Face Space
	Try the model interactively in the [Terjman-Ultra Space](https://huggingface.co/spaces/BounharAbdelaziz/Terjman-Ultra-v2.0) 🤗

	### Use with Text Generation Inference (TGI)
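	For fast inference, you can try Hugging Face TGI. Note that `pip install text-generation` installs only the Python client; the `text-generation-launcher` command comes from a separate TGI installation (for example, the official Docker image).

	```bash
	# Python client for querying a running TGI server
	pip install text-generation
	# Start the server (requires a TGI installation that provides the launcher)
	text-generation-launcher --model-id BounharAbdelaziz/Terjman-Supreme-v2.0
	```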

	### Run Locally with Transformers & PyTorch
	```bash
	pip install transformers torch
	python -c "from transformers import pipeline; print(pipeline('translation', model='BounharAbdelaziz/Terjman-Supreme-v2.0', src_lang='eng_Latn', tgt_lang='ary_Arab')('Hello there!'))"
	```

	### Deploy on an API Server
	Use FastAPI to serve translations as an API:

	```python
	from fastapi import FastAPI
	from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

	app = FastAPI()
	model_name = "BounharAbdelaziz/Terjman-Supreme-v2.0"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

	@app.get("/translate/")
	def translate(text: str):
	    # Tokenize the query text (English -> Moroccan Darija)
	    inputs = tokenizer(text, return_tensors="pt", src_lang="eng_Latn", tgt_lang="ary_Arab")
	    output = model.generate(**inputs)
	    # Return the decoded translation as a JSON response
	    return {"translation": tokenizer.decode(output[0], skip_special_tokens=True)}
	```
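
Once the server above is running, the endpoint can be queried from any HTTP client. The sketch below uses only the Python standard library. It assumes the app was started with something like `uvicorn main:app` (the file name `main.py` is hypothetical) and is listening on uvicorn's default address `127.0.0.1:8000`; `build_translate_url` and `fetch_translation` are illustrative helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the FastAPI app is served locally on uvicorn's default host and port.
BASE_URL = "http://127.0.0.1:8000"


def build_translate_url(text, base_url=BASE_URL):
    """Build the GET URL for /translate/ with the text safely URL-encoded."""
    return f"{base_url}/translate/?{urlencode({'text': text})}"


def fetch_translation(text, base_url=BASE_URL, timeout=60):
    """Call the running server and return the 'translation' field of its JSON reply."""
    with urlopen(build_translate_url(text, base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["translation"]


# Building the URL does not require the server to be running.
print(build_translate_url("Hello there!"))
# prints: http://127.0.0.1:8000/translate/?text=Hello+there%21
```

With the server up, `fetch_translation("Hello there!")` should return the translated string. The generous default timeout is a guess to allow for slow generation on CPU.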


	## 🛠️ Training Hyperparameters

	The model was fine-tuned using the following training settings:

	- Learning Rate: `0.0005`
	- Training Batch Size: `1`
	- Evaluation Batch Size: `1`
	- Seed: `42`
	- Gradient Accumulation Steps: `64`
	- Total Effective Batch Size: `64`
	- Optimizer: `AdamW (Torch)` with `betas=(0.9,0.999)`, `epsilon=1e-08`
	- Learning Rate Scheduler: `Linear`
	- Warmup Ratio: `0.1`
	- Epochs: `3`
	- Precision: `Mixed FP16` for efficient training
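
As a rough illustration of how these settings interact, the sketch below derives the effective batch size and evaluates a linear schedule with warmup at a few steps. It is a plain-Python approximation, not the actual training code. The total step count is a hypothetical value chosen for readability, and the single-device assumption is inferred from 1 × 64 = 64.

```python
import math

PEAK_LR = 5e-4       # Learning Rate from the list above
WARMUP_RATIO = 0.1   # Warmup Ratio from the list above


def linear_lr(step, total_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    return peak_lr * max(0, total_steps - step) / max(1, total_steps - warmup_steps)


# Effective batch size = per-device batch size x gradient accumulation steps
# (assuming a single device, which is consistent with the reported total of 64).
per_device_batch_size = 1
gradient_accumulation_steps = 64
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

total_steps = 1000  # hypothetical number of optimizer steps, for illustration only
for step in (0, 50, 100, 550, 1000):
    print(f"step {step:>4}: lr = {linear_lr(step, total_steps):.6f}")
print("effective batch size:", effective_batch_size)
```

Under these assumptions, the learning rate peaks at `0.0005` after the first 10% of steps and decays linearly to zero by the end of training.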

	## 📜 License

	This model is released under the CC BY-NC 4.0 (Creative Commons Attribution-NonCommercial) license: it may be used for research and personal projects, but not for commercial purposes. For commercial use, please get in touch :)

	### Framework versions

	- Transformers 4.47.1
	- PyTorch 2.5.1+cu124
	- Datasets 3.1.0
	- Tokenizers 0.21.0

	```bibtex
	@misc{terjman-v2,
	  title = {Terjman-v2: High-Quality English-Moroccan Darija Translation Model},
	  author = {Abdelaziz Bounhar},
	  year = {2025},
	  howpublished = {\url{https://huggingface.co/BounharAbdelaziz/Terjman-Supreme-v2.0}},
	  license = {CC BY-NC}
	}
	```