fdelucaf committed
Commit dd761c0
1 Parent(s): ba27e8d

Update README.md

Files changed (1): README.md (+22 -12)
README.md CHANGED
@@ -14,13 +14,20 @@ library_name: transformers
 
 ## Model description
 
-This model was created during BSC's participation in the Shared Task: Translation into Low-Resource Languages of Spain (WMT24). It results from a full fine-tuning of the NLLB-200-600M model with a Spanish-Asturian corpus. Specfically, we used the [transformers library](https://huggingface.co/docs/transformers/) from Hugging Face and a filtered version of the [Spanish-Asturian dataset](__poner_link___) to fine-tune the model. The model was evaluated using the Flores evaluation datasets. Please refer to the [paper](__poner_link___) for more information.
+This model was created as part of the participation of Language Technologies Unit at BSC in the WMT24 Shared Task:
+[Translation into Low-Resource Languages of Spain](https://www2.statmt.org/wmt24/romance-task.html).
+It results from a full fine-tuning of the NLLB-200-600M model with a Spanish-Asturian corpus.
+Specifically, we used the [transformers library](https://huggingface.co/docs/transformers/) from Hugging Face and a filtered version
+of the [Spanish-Asturian dataset](https://huggingface.co/datasets/projecte-aina/ES-AST_Parallel_Corpus) to fine-tune the model.
+The model was evaluated using the Flores evaluation datasets.
+Please refer to the [paper](__poner_link___) for more information.
 
 ## Intended uses and limitations
 
 You can use this model for machine translation from Spanish to Asturian.
 
 ## Limitations and bias
+
 At the time of submission, no measures have been taken to estimate the bias and toxicity embedded in the model.
 However, we are well aware that our models may be biased. We intend to conduct research in these areas in the future, and if completed, this model card will be updated.
 
@@ -32,18 +39,21 @@ We use the BLEU and ChrF score for evaluation on the [Flores+](https://github.co
 
 ### Evaluation results
 
-Below are the evaluation results on the machine translation from Spanish to Asturian compared to [Apertium](https://www.apertium.org/), [Eslema](https://eslema.it.uniovi.es/) and [NLLB-200-600M](https://huggingface.co/facebook/nllb-200-distilled-600M):
+Below are the evaluation results on the machine translation from Spanish to Asturian compared to [Apertium](https://www.apertium.org/),
+[Eslema](https://eslema.it.uniovi.es/) and [NLLB-200-600M](https://huggingface.co/facebook/nllb-200-distilled-600M):
 
 
-| Test set (BLEU) | Apertium | Eslema | NLLB-600M | Our model
-|----------------------|------------|------------------|---------------|
-| Flores dev | 16.66 | 17.30 | 17.23 | **19.33** |
-| Flores devtest | 16.99 | 17.17 | 16.21 | **18.43** |
+| Test set (BLEU)      | Apertium | Eslema | NLLB-600M | Our model |
+|:---------------------|:---------|:-------|:----------|:----------|
+| Flores dev           | 16.66    | 17.30  | 17.23     | **19.33** |
+| Flores devtest       | 16.99    | 17.17  | 16.21     | **18.43** |
 
-| Test set (ChrF) | Apertium | Eslema | NLLB-600M | Our model
-|----------------------|------------|------------------|---------------|
-| Flores dev | 50.57 | 50.77 | 49.72 | **52.26** |
-| Flores devtest | 50.84 | 50.91 | 49.05 | **52.14** |
+| Test set (ChrF)      | Apertium | Eslema | NLLB-600M | Our model |
+|:---------------------|:---------|:-------|:----------|:----------|
+| Flores dev           | 50.57    | 50.77  | 49.72     | **52.26** |
+| Flores devtest       | 50.84    | 50.91  | 49.05     | **52.14** |
 
+
+
 ## Additional information
 
@@ -60,14 +70,14 @@ For further information, please send an email to <[email protected]>.
 Copyright(c) 2024 by Language Technologies Unit, Barcelona Supercomputing Center.
 
 ### License
-[CC-BY-NC](__poner_link___)
+[CC-BY-NC-4.0](https://creativecommons.org/licenses/by-nc/4.0/)
 
 ### Disclaimer
 
 <details>
 <summary>Click to expand</summary>
 
-The model published in this repository is intended for a generalist purpose and is available to third parties under a permissive Apache License, Version 2.0.
+The model published in this repository is intended for a generalist purpose and is available to third parties under a CC BY-NC 4.0 licence.
 
 Be aware that the model may have biases and/or any other undesirable distortions.
 
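The updated card describes Spanish-to-Asturian translation with a fine-tuned NLLB-200-600M. As a minimal sketch of how such a model is typically queried with the transformers library: NLLB checkpoints use FLORES-200 language codes (`spa_Latn` for Spanish, `ast_Latn` for Asturian), and the target language is selected by forcing the decoder's first token. The repository id below is a placeholder, not the actual model id.

```python
# Sketch: Spanish -> Asturian inference with a fine-tuned NLLB checkpoint.
MODEL_ID = "path/to/fine-tuned-nllb-es-ast"  # hypothetical; use the real repo id

SRC_LANG = "spa_Latn"  # Spanish, FLORES-200 code used by NLLB
TGT_LANG = "ast_Latn"  # Asturian, FLORES-200 code used by NLLB


def translate(text: str, model_id: str = MODEL_ID) -> str:
    # transformers is imported lazily so the sketch can be read and tested
    # without downloading any model weights.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang=SRC_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(text, return_tensors="pt")
    # Force the decoder to start generating in the target language.
    generated = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_length=256,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]


if __name__ == "__main__":
    print(translate("El tiempo de hoy es soleado."))
```

The `forced_bos_token_id` pattern is the standard way to pick the output language with multilingual NLLB-family models; without it, the model may decode into the wrong language.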