Update README.md
README.md
CHANGED
@@ -26,7 +26,7 @@ The tokenizer of this model after adaptation is the same as [Minverva-3B](https:
 
 ## Data used for the adaptation
 
-The **Mistral-7B-v0.1-Adapted**
+The **Mistral-7B-v0.1-Adapted** models are trained on a collection of Italian and English data extracted from [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX).
 The data are extracted to be skewed toward Italian language with a ration of one over four. Extracting the first 9B tokens from Italian part of CulturaX and the first 3B tokens from English part of CulturaX.
 
 
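The extraction described in the README (take the first 9B Italian tokens and the first 3B English tokens) can be sketched as a budgeted prefix over a document stream. This is only an illustrative sketch, not the authors' pipeline: `take_first_tokens` and `whitespace_count` are hypothetical helpers, whitespace splitting stands in for the model's real tokenizer, and reading "one over four" as the English share of the 12B total is an assumption.

```python
from typing import Callable, Iterable, Iterator

# Token budgets as stated in the README.
ITALIAN_BUDGET = 9_000_000_000
ENGLISH_BUDGET = 3_000_000_000

# Assumed reading of "one over four": English share of the combined budget.
english_share = ENGLISH_BUDGET / (ITALIAN_BUDGET + ENGLISH_BUDGET)


def whitespace_count(text: str) -> int:
    """Hypothetical stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def take_first_tokens(
    docs: Iterable[str],
    budget: int,
    count_tokens: Callable[[str], int],
) -> Iterator[str]:
    """Yield documents in their original order until the token budget is reached.

    The document that crosses the budget is kept whole, so the total may
    slightly exceed the budget.
    """
    used = 0
    for doc in docs:
        if used >= budget:
            break
        used += count_tokens(doc)
        yield doc
```

In practice `docs` would be a streamed iterator over the Italian or English split of CulturaX, and `count_tokens` would wrap the actual tokenizer; both are left abstract here because the README does not specify them.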