Update README.md
README.md CHANGED
@@ -20,9 +20,9 @@ datasets:
 
 **Dataset**
 
-The dataset is a translation to Spanish of alpaca_data_cleaned.json (a clean version of the Alpaca dataset made at Stanford) using OpenAI's gpt-3.5-turbo model. It was translated using a full-sample prompt instead of per strings, which resulted in more coherent tuples of (instruction, input, output)
-Dataset link:
+The dataset is a translation into Spanish of alpaca_data_cleaned.json (a cleaned version of the Alpaca dataset made at Stanford) using OpenAI's gpt-3.5-turbo model. The translation was made by bertin-project, using a full-sample prompt instead of translating string by string, which resulted in more coherent (instruction, input, output) tuples.
+Dataset link: [here](https://huggingface.co/datasets/bertin-project/alpaca-spanish)
 
 **Finetuning details**
 
-To fine-tune the FALCON-7B model we used the
-
+To fine-tune the FALCON-7B model we used the [following code](https://github.com/AdrianBZG/LLM-distributed-finetune) to run it on a distributed cluster on AWS. You are free to use that code as a blueprint to fine-tune any model you like, as it is easily customizable.
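For reference, a minimal sketch of loading the linked dataset with the Hugging Face `datasets` library. The repo id comes from the dataset link in the new text; the `"train"` split name is an assumption based on the usual Alpaca-style layout:

```python
# Minimal sketch: load the Spanish Alpaca translation referenced above.
# Assumes the standard Hugging Face `datasets` library; the split name
# "train" is an assumption based on the usual Alpaca-style layout.
from datasets import load_dataset

dataset = load_dataset("bertin-project/alpaca-spanish")

# Each record is one (instruction, input, output) tuple, as the README notes.
example = dataset["train"][0]
print(example["instruction"])
print(example["input"])    # may be empty for instruction-only samples
print(example["output"])
```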
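And a minimal local sketch of the fine-tuning setup the new text points at. This is not the distributed AWS code from the linked repo; it only shows loading FALCON-7B and joining one (instruction, input, output) tuple into a training string. The model id `tiiuae/falcon-7b` and the Spanish prompt template are assumptions, not taken from the repo:

```python
# Minimal local sketch, not the distributed AWS setup from the linked repo.
# `tiiuae/falcon-7b` is the assumed Hugging Face model id; the prompt
# template below is an assumed Spanish variant of the Alpaca format.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

def format_example(instruction: str, input_text: str, output: str) -> str:
    """Join one (instruction, input, output) tuple into a single training string."""
    if input_text:
        return (f"### Instrucción:\n{instruction}\n\n"
                f"### Entrada:\n{input_text}\n\n"
                f"### Respuesta:\n{output}")
    return f"### Instrucción:\n{instruction}\n\n### Respuesta:\n{output}"

text = format_example("Traduce al inglés:", "Hola, mundo.", "Hello, world.")
tokens = tokenizer(text, return_tensors="pt")  # ready for a causal-LM loss
```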