laurentiubp committed (verified) · Commit 2ae2d8f · 1 Parent(s): a8af409

Update README.md

Files changed (1): README.md +110 -27
README.md CHANGED
@@ -1,59 +1,142 @@
  ---
- license: other
- base_model: laurentiubp/CataLLaMA-v0.2.0
  tags:
- - trl
- - dpo
- - generated_from_trainer
  model-index:
- - name: CataLLaMA-v0.2.6
    results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # CataLLaMA-v0.2.6

- This model is a fine-tuned version of [laurentiubp/CataLLaMA-v0.2.0](https://huggingface.co/laurentiubp/CataLLaMA-v0.2.0) on an unknown dataset.

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:
  - learning_rate: 5e-07
- - train_batch_size: 4
- - eval_batch_size: 4
- - seed: 42
  - distributed_type: multi-GPU
  - num_devices: 4
- - total_train_batch_size: 16
- - total_eval_batch_size: 16
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_steps: 100
  - num_epochs: 1

- ### Training results

- ### Framework versions

- - Transformers 4.38.1
- - Pytorch 2.1.0+cu118
- - Datasets 2.16.1
- - Tokenizers 0.15.2
 
  ---
+ license: llama3
+ base_model: catallama/CataLlama-v0.1-Instruct-SFT
  tags:
+ - llama
+ - llama-3
+ - Catalan
  model-index:
+ - name: catallama/CataLlama-v0.1-Instruct-DPO
    results: []
+ datasets:
+ - catallama/Catalan-DPO
+ language:
+ - ca
+ - en
+ pipeline_tag: text-generation
  ---

+ **catallama/CataLlama-v0.1-Instruct-DPO** is a DPO fine-tune of [catallama/CataLlama-v0.1-Instruct-SFT](https://huggingface.co/catallama/CataLlama-v0.1-Instruct-SFT) on the [catallama/Catalan-DPO](https://huggingface.co/datasets/catallama/Catalan-DPO) dataset.
+
+ The model shows improved proficiency with the Catalan language.
+
+ **This is an instruction fine-tuned model, optimised with DPO, and proficient at the following tasks in Catalan:**
+
+ - *Information extraction (suitable for RAG)*
+ - *Named Entity Recognition (NER)*
+ - *Translation from English to Catalan and Catalan to English*
+ - *Summarization - both short form and long form*
+ - *Chat*
+ - *Sentiment analysis*
+ - *Open question answering*
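+
+ The preference data lives in the [catallama/Catalan-DPO](https://huggingface.co/datasets/catallama/Catalan-DPO) dataset referenced above. A minimal sketch for inspecting it with `datasets`, assuming the usual DPO-style `prompt`/`chosen`/`rejected` columns (the column names are an assumption, not confirmed by this card):
+
+ ```python
+ from datasets import load_dataset
+
+ # Load the preference dataset used for the DPO fine-tune
+ dpo_ds = load_dataset("catallama/Catalan-DPO", split="train")
+
+ # Check the actual schema before relying on specific column names
+ print(dpo_ds.column_names)  # expected (assumed): ['prompt', 'chosen', 'rejected']
+ print(dpo_ds[0])
+ ```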
+ **Model developers** [Laurentiu Petrea](https://www.linkedin.com/in/laurentiupetrea/) based on Llama-3 from Meta.
+
+ **Model Architecture** CataLlama is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and direct preference optimisation (DPO) to align with human preferences for helpfulness and safety.
+
+ **License** The model uses the llama-3 license available at: [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)
+
+ ### Use with transformers
+
+ See the snippet below for usage with Transformers:
+
+ **The model follows the same prompt template as Llama-3 Instruct.**
+
+ ```python
+ import transformers
+ import torch
+
+ model_id = "catallama/CataLlama-v0.1-Instruct-DPO"
+
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model_id,
+     model_kwargs={"torch_dtype": torch.bfloat16},
+     device_map="auto",
+ )
+
+ messages = [
+     {"role": "user", "content": "Ei com estàs avui?"},
+ ]
+
+ # Render the conversation with the model's Llama-3 Instruct chat template
+ prompt = pipeline.tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+
+ outputs = pipeline(
+     prompt,
+     max_new_tokens=1024,
+     do_sample=True,
+     temperature=0.6,
+     top_p=0.9,
+ )
+
+ # Strip the prompt from the output before printing
+ print(outputs[0]["generated_text"][len(prompt):])
+ ```
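+
+ With the Llama-3 Instruct template, the `prompt` string produced by `apply_chat_template` above should look roughly like this (reconstructed from the Llama-3 template definition, not captured from this model):
+
+ ```
+ <|begin_of_text|><|start_header_id|>user<|end_header_id|>
+
+ Ei com estàs avui?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
+
+ ```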
+
+ **Examples**
+
+ ```
+ Prompt: Pot Donald Trump sopar amb Juli Cèsar?
+
+ Answer: No, Donald Trump no podria sopar amb Juli Cèsar, ja que Juli Cèsar va morir fa més de 2.000 anys i, per tant, no està viu per a poder sopar amb ell. A més, Juli Cèsar va ser un personatge històric i polític de l'antiga Roma, mentre que Donald Trump és el 45è president dels Estats Units, personatges que pertanyen a dues èpoques i cultures diferents. Per tant, la idea de que poguessin sopar junts és absurda i no té cap sentit històric ni lògic.
+ ```
+
+ ```
+ Prompt: A quina velocitat poden volar els cocodrils?
+
+ Answer: Els cocodrils no poden volar, ja que no tenen ales ni cap estructura que els permeti elevar-se per l'aire. Aquestes criatures aquàtiques depenen de la seva força i agilitat per moure's i caçar preses a terra ferma o al riu. No obstant això, poden assolir velocitats notables quan es tracta de fugir o perseguir una presa. El cocodril més ràpid, el cocodril negre africà, pot arribar a assolir velocitats de fins a 40 km/h en terra ferma, tot i que la seva velocitat a l'aigua és molt menor.
+ ```
+
+ ```
+ Prompt: Explica pas a pas com resoldre l'equació següent: 2x + 10 = 0
+
+ Answer:
+ Per resoldre l'equació 2x + 10 = 0, segueix els passos següents:
+
+ 1. Aïllar la variable: per fer-ho, hem de restar 10 de tots dos costats de l'equació. Així, tenim:
+
+ 2x = -10
+
+ 2. Dividir tots dos costats per 2: això ens permet obtenir una equació en forma d'una sola incògnita. Així, tenim:
+
+ x = -5
+
+ 3. Comprovar: sempre és recomanable comprovar la solució en l'equació original. Substituint x = -5 en 2x + 10 = 0, obtenim:
+
+ 2(-5) + 10 = -10 + 10 = 0
+
+ Com que l'equació es compleix, la solució x = -5 és vàlida.
+ ```

  ## Training procedure

+ The model was trained **with the same prompt template as Llama-3 Instruct**.
+
+ The model was trained for one epoch on **4x A100 80GB GPUs using DeepSpeed ZeRO** Stage-3 without CPU offloading.
+
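+ The DeepSpeed configuration is not published with this card; below is a minimal sketch of a ZeRO Stage-3 setup without CPU offloading, passed to `transformers.TrainingArguments` (all values are illustrative assumptions, not the authors' actual config):
+
+ ```python
+ from transformers import TrainingArguments
+
+ # Hypothetical ZeRO Stage-3 config: shards optimizer states, gradients and
+ # parameters across the 4 GPUs; CPU offloading stays disabled
+ ds_config = {
+     "bf16": {"enabled": True},
+     "zero_optimization": {
+         "stage": 3,
+         "offload_optimizer": {"device": "none"},
+         "offload_param": {"device": "none"},
+         "overlap_comm": True,
+     },
+     "gradient_accumulation_steps": "auto",
+     "train_micro_batch_size_per_gpu": "auto",
+ }
+
+ training_args = TrainingArguments(
+     output_dir="catallama-dpo",  # hypothetical path
+     bf16=True,
+     deepspeed=ds_config,  # launch across the 4 GPUs, e.g. with `deepspeed` or `torchrun`
+ )
+ ```
+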
  ### Training hyperparameters

  The following hyperparameters were used during training:
  - learning_rate: 5e-07
  - distributed_type: multi-GPU
  - num_devices: 4
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_steps: 100
  - num_epochs: 1
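+
+ For orientation, a minimal sketch of how these hyperparameters could map onto a `trl` DPO run (the `beta` value, the dataset split and the `DPOTrainer` wiring are assumptions; the card does not publish its training script):
+
+ ```python
+ from datasets import load_dataset
+ from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
+ from trl import DPOTrainer
+
+ base = "catallama/CataLlama-v0.1-Instruct-SFT"
+ model = AutoModelForCausalLM.from_pretrained(base)
+ tokenizer = AutoTokenizer.from_pretrained(base)
+
+ args = TrainingArguments(
+     output_dir="catallama-dpo",  # hypothetical path
+     learning_rate=5e-7,          # as listed above; Adam betas/epsilon are the defaults
+     lr_scheduler_type="linear",
+     warmup_steps=100,
+     num_train_epochs=1,
+     bf16=True,
+ )
+
+ trainer = DPOTrainer(
+     model=model,
+     ref_model=None,  # trl then keeps a frozen copy of `model` as the reference
+     args=args,
+     beta=0.1,        # assumption: the DPO beta is not stated in the card
+     train_dataset=load_dataset("catallama/Catalan-DPO", split="train"),
+     tokenizer=tokenizer,
+ )
+ trainer.train()
+ ```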

+ ## Intended Use
+
+ **Note:** This model is not intended to beat benchmarks, but to demonstrate techniques for augmenting LLMs with new languages and to help preserve rare languages as part of our world heritage.
+
+ **Intended Use Cases** Llama 3 is intended for commercial and research use in English. Instruction tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks.
+
+ **Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and the Llama 3 Community License. Use in languages other than English (see the note below).
+
+ **Note:** Developers may fine-tune Llama 3 models for languages beyond English provided they comply with the Llama 3 Community License and the Acceptable Use Policy.