---
license: cc-by-nc-4.0
datasets:
- barbaroo/Sprotin_parallel
- barbaroo/fo_en_synthetic
language:
- en
- fo
metrics:
- bleu
- chrf
- bertscore
base_model:
- facebook/nllb-200-distilled-600M
pipeline_tag: translation
---
# barbaroo/nllb_200_600M_en_fo

## Model Description
- Model Architecture: This model is based on the NLLB 600M architecture and weights.
- Languages: This checkpoint is fine-tuned to translate from English (`en`) to Faroese (`fo`).
- Size: ~600M parameters.
- Finetuning Datasets (see the loading sketch after this list):
  - [barbaroo/Sprotin_parallel](https://huggingface.co/datasets/barbaroo/Sprotin_parallel)
  - [barbaroo/fo_en_synthetic](https://huggingface.co/datasets/barbaroo/fo_en_synthetic)
- Training Regime: Trained until convergence (about 2 epochs).
- License: Inherits the original licenses of the NLLB 600M model.
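As a minimal sketch of how to inspect the fine-tuning corpora listed above, both datasets can be loaded with the Hugging Face `datasets` library. No particular split or column layout is assumed here; printing the objects simply shows whatever structure they actually contain:

```python
from datasets import load_dataset

# Load the two fine-tuning corpora from the Hugging Face Hub.
sprotin = load_dataset("barbaroo/Sprotin_parallel")
synthetic = load_dataset("barbaroo/fo_en_synthetic")

# Printing the DatasetDict objects reveals the available splits and columns.
print(sprotin)
print(synthetic)
```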
## Intended Use
- Primary Use Case: Translate text from English to Faroese.
- Audience: Researchers, developers, or anyone interested in Faroese language processing.
- Usage Scenarios:
  - Building Faroese-English translation tools
  - Language research and corpus analysis
  - Synthetic data creation
**Important:** While the model can produce fluent translations, it is not guaranteed to be perfectly accurate on all inputs. Users should verify critical or sensitive content through human experts.
## Metrics

- Model performance measures: The NLLB-200 model was evaluated using BLEU, chrF and BERTScore, metrics widely adopted by the machine translation community.
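For illustration, the same three metrics can be computed with the Hugging Face `evaluate` library. This is only a sketch: the sentences below are placeholders rather than outputs of this model, and for Faroese BERTScore should fall back to a multilingual encoder.

```python
import evaluate

# Placeholder hypothesis/reference pairs; replace with real model outputs
# and Faroese reference translations.
predictions = ["model output sentence"]
references = [["reference translation sentence"]]

bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")
bertscore = evaluate.load("bertscore")

print(bleu.compute(predictions=predictions, references=references))
print(chrf.compute(predictions=predictions, references=references))
# BERTScore needs a language hint (or an explicit model_type) for non-English text.
print(bertscore.compute(predictions=predictions,
                        references=[r[0] for r in references],
                        lang="fo"))
```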
## Evaluation Data
- Datasets: The Flores-200 dataset, described in Section 4 of the NLLB paper/documentation.
- Motivation: Flores-200 is currently the only machine translation benchmark available for Faroese.
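For reference, the sketch below shows one way to load the English-Faroese FLORES-200 split from the Hugging Face Hub. The repository name, pair config and column names (`facebook/flores`, `eng_Latn-fao_Latn`, `sentence_*`) are assumptions about the public FLORES-200 release rather than something defined by this model card, and the loading script may require `trust_remote_code=True`:

```python
from datasets import load_dataset

# Assumed Hub location and pair config for FLORES-200 (eng_Latn -> fao_Latn).
flores = load_dataset("facebook/flores", "eng_Latn-fao_Latn",
                      split="devtest", trust_remote_code=True)

# FLORES-200 pair configs expose one column per language: "sentence_<lang_code>".
print(flores[0]["sentence_eng_Latn"])
print(flores[0]["sentence_fao_Latn"])
```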
## How to Use

Below is a simple usage example in Python with Hugging Face Transformers:
```python
from transformers import pipeline

model_name = "barbaroo/nllb_200_600M_en_fo"
# NLLB uses FLORES-200 language codes: "eng_Latn" (English), "fao_Latn" (Faroese).
translator = pipeline("translation", model=model_name, tokenizer=model_name,
                      src_lang="eng_Latn", tgt_lang="fao_Latn")

text = "Hello, how are you?"
translation = translator(text)
print(translation)
```
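If more control over generation is needed (batching, beam search, length limits), the tokenizer and model can also be used directly. The following is a minimal sketch that follows the standard NLLB recipe, setting the source language on the tokenizer and forcing the Faroese target-language token at decoding time:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "barbaroo/nllb_200_600M_en_fo"
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "Hello, how are you?"
inputs = tokenizer(text, return_tensors="pt")

# Force the decoder to start with the Faroese language token ("fao_Latn").
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fao_Latn"),
    max_new_tokens=128,
    num_beams=5,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```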
## Citation
If you use this model or find it helpful in your research, please cite: [COMING SOON]
## Contact
For questions, feedback, or collaboration inquiries, feel free to reach out:
- Primary Contact: Barbara Scalvini ([email protected] / [email protected])