metadata
license: cc-by-nc-4.0
datasets:
- barbaroo/Sprotin_parallel
- barbaroo/fo_en_synthetic
language:
- en
- fo
metrics:
- bleu
- chrf
- bertscore
base_model:
- facebook/nllb-200-distilled-1.3B
pipeline_tag: translation
barbaroo/nllb_200_1.3B_en_fo
Model Description
- Model Architecture: This model is based on the NLLB 1.3B architecture and weights.
- Languages: This checkpoint is fine-tuned to translate from English (en) to Faroese (fo).
- Size: ~1.3B parameters.
- Finetuning Datasets:
  - Sprotin_parallel
  - fo_en_synthetic
- Training Regime: Trained until convergence (about 2 epochs).
- License: Inherits the original licenses of the NLLB 1.3B model.
Intended Use
- Primary Use Case: Translate text from English to Faroese.
- Audience: Researchers, developers, or anyone interested in Faroese language processing.
- Usage Scenarios:
  - Building Faroese-English translation tools
  - Language research and corpus analysis
  - Synthetic data creation
Important: While the model can produce fluent translations, it is not guaranteed to be perfectly accurate on all inputs. Users should verify critical or sensitive content through human experts.
Metrics
- Model performance measures:
This model was evaluated using BLEU, chrF and BERTScore, metrics widely adopted by the machine translation community. Additionally, two human experts evaluated the model using the ESA framework on a small dataset (about 200 sentences) of English sentences from news outlets (BBC, CNN, Al Jazeera).
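To give a rough intuition for what a character-level metric such as chrF measures, the sketch below computes a simplified character n-gram F-score for a single sentence pair. It is an illustration only: the function names `char_ngrams` and `simple_chrf` are hypothetical, the defaults (`max_n=6`, `beta=2.0`) are assumptions, and the code does not reproduce the exact behaviour of any established scoring tool, so its numbers are not comparable to published scores.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of length n, ignoring whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF-style score in [0, 1] for one sentence pair."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: beta > 1 weights recall more heavily than precision
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

Identical strings score 1.0 and strings with no shared characters score 0.0; partial overlaps fall in between. For results intended for comparison with other systems, an established evaluation toolkit is the safer choice.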
Evaluation Data
Datasets:
Flores-200 dataset is described in Section 4 of the NLLB paper/documentation.
Motivation:
Flores-200 is currently the only machine translation benchmark available for Faroese.
How to Use
Below is a simple usage example in Python with Hugging Face Transformers:
from transformers import pipeline

# Fine-tuned English-to-Faroese checkpoint described in this card
model_name = "barbaroo/nllb_200_1.3B_en_fo"

translator = pipeline(
    "translation",
    model=model_name,
    tokenizer=model_name,
    src_lang="eng_Latn",  # Language code for English
    tgt_lang="fao_Latn",  # Language code for Faroese
)

text = "Hello, how are you?"
translation = translator(text)  # list with one dict holding "translation_text"
print(translation)
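When translating many sentences, it can be convenient to feed them to the pipeline in chunks and collect plain strings. The helper below is a minimal sketch of that pattern; `translate_in_batches` is a hypothetical name, the default `batch_size` of 8 is an arbitrary assumption, and the function only relies on the translator returning a list of dictionaries with a `"translation_text"` key, as the Transformers translation pipeline does.

```python
from typing import Callable, Iterable, List


def translate_in_batches(translator: Callable, texts: Iterable[str],
                         batch_size: int = 8) -> List[str]:
    """Translate texts in fixed-size chunks and return plain strings."""
    texts = list(texts)
    results: List[str] = []
    for start in range(0, len(texts), batch_size):
        chunk = texts[start:start + batch_size]
        outputs = translator(chunk)  # one dict per input sentence
        results.extend(item["translation_text"] for item in outputs)
    return results
```

With the `translator` object from the example above, a call such as `translate_in_batches(translator, ["Hello, how are you?", "Good morning."])` would return a list of Faroese strings in the same order as the inputs.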
Citation
If you use this model or find it helpful in your research, please cite: [COMING SOON]
Contact
For questions, feedback, or collaboration inquiries, feel free to reach out:
- Primary Contact: < Barbara Scalvini/ [email protected] / [email protected] >