---
language:
- et
- ar
- de
- en
- fi
- fr
- lt
- lv
- ru
- es
- sv
- uk
- zh
metrics:
- bleu
library_name: fairseq
pipeline_tag: translation
---
# Model Card for SynEst Translation Models

The SynEst models are machine translation models for translating from and into Estonian.

## Model Details

The models are based on the [NLLB-1.3B](https://huggingface.co/facebook/nllb-200-1.3B) multilingual model. 
The NLLB encoder is frozen, and a new, smaller decoder is trained for each target language.

## Languages

The models were trained to translate from Estonian into German, English, Finnish, Russian, Ukrainian, and Chinese,
and into Estonian from Arabic, German, English, Finnish, French, Lithuanian, Latvian, Russian, Spanish,
Swedish, Ukrainian, and Chinese.

However, because the parameters of the NLLB encoder are frozen, the models can also translate from any
other NLLB language, albeit likely with lower quality than for the languages on which they were
fine-tuned.
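
The distinction between fine-tuned and merely possible directions can be sketched as a small lookup. This is an illustration only, not part of the released code: the function name `direction_status` and the status labels are hypothetical, the two-letter codes are taken from this card's metadata (the translation worker may use different codes), and the sketch does not check that a source code is actually an NLLB language.

```python
# Directions listed in this model card, using the two-letter codes from its metadata.
FROM_ESTONIAN = {"de", "en", "fi", "ru", "uk", "zh"}
TO_ESTONIAN = {"ar", "de", "en", "fi", "fr", "lt", "lv", "ru", "es", "sv", "uk", "zh"}


def direction_status(src: str, tgt: str) -> str:
    """Classify a translation direction (hypothetical helper, not the worker's API)."""
    if src == tgt:
        return "unsupported"
    if src == "et" and tgt in FROM_ESTONIAN:
        return "fine-tuned"
    if tgt == "et" and src in TO_ESTONIAN:
        return "fine-tuned"
    if tgt == "et" or tgt in FROM_ESTONIAN:
        # A decoder exists for this target, but the source language was not
        # part of fine-tuning, so quality is likely lower.
        return "zero-shot"
    # No decoder was trained for this target language.
    return "unsupported"


if __name__ == "__main__":
    for pair in [("et", "de"), ("fr", "et"), ("fr", "de"), ("et", "fr")]:
        print(pair, direction_status(*pair))
```

A caller could use such a check to warn users before they request a direction that was not covered by fine-tuning.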

## How to Use the Model

The easiest way to run the models is with the [dedicated branch of the TartuNLP translation worker](https://github.com/TartuNLP/translation-worker/tree/nllb-based-est).
Place `nllb-based`, with all its contents, inside the worker's `models/` directory.
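
Placing `nllb-based` inside `models/` can be scripted. The following is a minimal sketch, assuming the model files have already been downloaded to a local directory and the worker branch has been checked out; the helper name `install_model_dir` and both example paths are hypothetical.

```python
import shutil
from pathlib import Path


def install_model_dir(model_src: Path, worker_root: Path) -> Path:
    """Copy a downloaded model directory into `<worker_root>/models/`.

    Hypothetical helper: `model_src` is the local `nllb-based` directory,
    `worker_root` is the checkout of the translation worker.
    """
    if not model_src.is_dir():
        raise FileNotFoundError(f"model directory not found: {model_src}")
    target = worker_root / "models" / model_src.name
    # dirs_exist_ok=True allows re-running the copy over an existing install
    # (requires Python 3.8 or newer); missing parent directories are created.
    shutil.copytree(model_src, target, dirs_exist_ok=True)
    return target


if __name__ == "__main__":
    # Example paths are assumptions; adjust them to the local setup.
    print(install_model_dir(Path("downloads/nllb-based"), Path("translation-worker")))
```

Copying rather than moving keeps the original download intact, which may be convenient when testing several worker checkouts.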

<!-- ## Evaluation

#### Testing Data

* [Flores](https://huggingface.co/datasets/facebook/flores) (devtest)
* [MTee](https://github.com/Project-MTee/MTee_translation_benchmarks/tree/main/benchmark_datasets)

#### Metrics

BLEU

### Results -->

## Model Card Authors

[@lisskor](https://huggingface.co/lisskor)