	---
	license: cc-by-nc-4.0
	datasets:
	- barbaroo/Sprotin_parallel
	- barbaroo/fo_en_synthetic
	language:
	- en
	- fo
	metrics:
	- bleu
	- chrf
	- bertscore
	base_model:
	- facebook/nllb-200-distilled-600M
	pipeline_tag: translation
	---
	# barbaroo/nllb_200_600M_en_fo

	## Model Description

	- Model Architecture: This model is based on the [NLLB 600M architecture](https://huggingface.co/facebook/nllb-200-distilled-600M) and weights.
	- Languages: This checkpoint is fine-tuned to translate from English (`en`) to Faroese (`fo`).
	- Size: ~600M parameters.
	- Finetuning Datasets:
	  - [Sprotin_parallel](https://huggingface.co/datasets/barbaroo/Sprotin_parallel)
	  - [fo_en_synthetic](https://huggingface.co/datasets/barbaroo/fo_en_synthetic)
	- Training Regime: Trained until convergence (about 2 epochs).
	- License: CC-BY-NC 4.0, inherited from the original [NLLB 600M model](https://huggingface.co/facebook/nllb-200-distilled-600M).

	## Intended Use

	- Primary Use Case: Translate text from English to Faroese.
	- Audience: Researchers, developers, or anyone interested in Faroese language processing.
	- Usage Scenarios:
	  - Building Faroese-English translation tools
	  - Language research and corpus analysis
	  - Synthetic data creation

	> Important: While the model can produce fluent translations, it is not guaranteed to be perfectly accurate on all inputs. Users should verify critical or sensitive content through human experts.


	## Metrics

	- Model performance measures: The model was evaluated using BLEU, chrF, and BERTScore, metrics widely adopted by the machine translation community.
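
To give a feel for what a character-level metric such as chrF measures, here is a minimal pure-Python sketch of a character n-gram F-score. It is a simplified illustration only: the function names `char_ngrams` and `simple_chrf` are hypothetical, the averaging scheme is an assumption, and the numbers it produces will not match those of a standard implementation such as sacreBLEU, which should be used for any reported scores.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of length n, ignoring whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF-style score in [0, 1] (not the sacreBLEU implementation)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta with beta=2 weights recall more heavily than precision.
    return (1 + beta**2) * p * r / (beta**2 * p + r)


# Illustrative strings only; not taken from the evaluation data.
score = simple_chrf("Hvussu hevur tú tað", "Hvussu hevur tú tað í dag")
print(f"{score:.3f}")
```

Because the score is built from character n-grams rather than whole words, it gives partial credit for near-matches in inflected forms, which is one reason character-level metrics are often reported alongside BLEU for morphologically rich languages.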

	## Evaluation Data

	- Datasets: The Flores-200 dataset, described in Section 4 of the NLLB paper/documentation.
	- Motivation: Flores-200 is currently the only machine translation benchmark available for Faroese.

	## How to Use

	Below is a simple usage example in Python with [Hugging Face Transformers](https://github.com/huggingface/transformers):

	```python
	from transformers import pipeline

	model_name = "barbaroo/nllb_200_600M_en_fo"

	# Build a translation pipeline; NLLB uses FLORES-200 language codes.
	translator = pipeline(
	    "translation",
	    model=model_name,
	    tokenizer=model_name,
	    src_lang="eng_Latn",  # Language code for English
	    tgt_lang="fao_Latn",  # Language code for Faroese
	)

	text = "Hello, how are you?"
	translation = translator(text)
	# The pipeline returns a list with one dict per input.
	print(translation[0]["translation_text"])
	```
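
For use cases such as synthetic data creation, many sentences usually need to be translated at once. The following is a minimal sketch of a batching helper, assuming the translator behaves like a Transformers translation pipeline (called with a list of strings, returning one dict with a `translation_text` key per input). The names `translate_in_batches` and `fake_translator` are hypothetical, and the stand-in translator exists only so the sketch runs without downloading the model.

```python
from typing import Callable, Dict, Iterable, List


def translate_in_batches(
    translator: Callable[[List[str]], List[Dict[str, str]]],
    sentences: Iterable[str],
    batch_size: int = 8,
) -> List[str]:
    """Translate sentences in fixed-size batches and return plain strings."""
    sentences = list(sentences)
    outputs: List[str] = []
    for start in range(0, len(sentences), batch_size):
        batch = sentences[start:start + batch_size]
        # Assumed output shape: one dict per input, keyed by "translation_text".
        results = translator(batch)
        outputs.extend(item["translation_text"] for item in results)
    return outputs


def fake_translator(batch: List[str]) -> List[Dict[str, str]]:
    """Offline stand-in for the real pipeline (hypothetical, for illustration)."""
    return [{"translation_text": f"<fo> {text}"} for text in batch]


print(translate_in_batches(fake_translator, ["Hello.", "Good morning.", "Thank you."], batch_size=2))
# prints ['<fo> Hello.', '<fo> Good morning.', '<fo> Thank you.']
```

With the real model, the `translator` pipeline object from the usage example could be passed in place of `fake_translator`; a suitable `batch_size` depends on the available memory.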

	## Citation

	If you use this model or find it helpful in your research, please cite: [COMING SOON]

	## Contact

	For questions, feedback, or collaboration inquiries, feel free to reach out:

	- Primary Contact: Barbara Scalvini / [email protected] / [email protected]