fdelucaf committed on
Commit
ded054f
1 Parent(s): 9c4a057

Add base model

Files changed (1)
  1. README.md +7 -2
README.md CHANGED
@@ -9,6 +9,8 @@ metrics:
  - bleu
  - chrf
  library_name: transformers
+ base_model:
+ - facebook/nllb-200-distilled-600M
  ---
  ## Projecte Aina’s Spanish-Aranese machine translation model

@@ -18,7 +20,10 @@ This model was created as part of the participation of Language Technologies Uni
  [Translation into Low-Resource Languages of Spain](https://www2.statmt.org/wmt24/romance-task.html).
  It results from a full fine-tuning of the NLLB-200-600M model with a Spanish-Aranese corpus.
  Specifically, we used the [transformers library](https://huggingface.co/docs/transformers/) from Hugging Face and a filtered version
- of the [Spanish-Aranese dataset](https://huggingface.co/datasets/projecte-aina/ES-OC_Parallel_Corpus) to fine-tune the model. Since the original NLLB-200-600M doesn't support Aranese, we added a new token ("arn_Latn") to enable translation into Aranese. This language tag helps the model recognize the source and target languages for translation. The model was evaluated using the Flores+ evaluation datasets. Please refer to the [paper](__poner_link___) for more information.
+ of the [Spanish-Aranese dataset](https://huggingface.co/datasets/projecte-aina/ES-OC_Parallel_Corpus) to fine-tune the model.
+ Since the original NLLB-200-600M doesn't support Aranese, we added a new token ("arn_Latn") to enable translation into Aranese.
+ This language tag helps the model recognize the source and target languages for translation.
+ The model was evaluated using the Flores+ evaluation datasets. Please refer to the [paper](__poner_link___) for more information.

  ## Intended uses and limitations

@@ -64,7 +69,7 @@ For further information, please send an email to <[email protected]>.
  Copyright(c) 2024 by Language Technologies Unit, Barcelona Supercomputing Center.

  ### License
- [CC-BY-NC](__poner_link___)
+ [CC-BY-NC-4.0](https://creativecommons.org/licenses/by-nc/4.0/)

  ### Disclaimer
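The README text in this commit says a new token ("arn_Latn") was added because the original NLLB-200-600M doesn't support Aranese. Below is a minimal sketch of how such a language tag could be registered with the transformers library. It is not taken from the authors' training code: the helper names `add_language_tag` and `load_base_and_add_tag` are hypothetical, and using `add_tokens(..., special_tokens=True)` followed by `resize_token_embeddings` is an assumption about one reasonable way to do it.

```python
def add_language_tag(tokenizer, model, tag="arn_Latn"):
    """Register `tag` as a special token and return its vocabulary id.

    Hypothetical helper (not from the commit): works with any tokenizer/model
    pair exposing the standard transformers methods used below.
    """
    # add_tokens returns how many tokens were actually new to the vocabulary.
    num_added = tokenizer.add_tokens([tag], special_tokens=True)
    if num_added:
        # Grow the embedding matrix so the new id has a row to look up.
        model.resize_token_embeddings(len(tokenizer))
    return tokenizer.convert_tokens_to_ids(tag)


def load_base_and_add_tag(name="facebook/nllb-200-distilled-600M"):
    """Load the base checkpoint named in `base_model` and add the Aranese tag.

    Assumption: requires the transformers package and network access.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)
    tag_id = add_language_tag(tokenizer, model)
    return tokenizer, model, tag_id
```

Calling `load_base_and_add_tag()` downloads the checkpoint listed in the new `base_model` field. The embedding row for the new tag has not seen any Aranese data at this point, so it only becomes meaningful after fine-tuning, which matches the full fine-tuning the README describes. Tokenizer behaviour around added language codes can differ between transformers versions, so treat this as a starting point to verify against the installed version.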