chad-brouze committed (verified)
Commit 269e35a · 1 Parent(s): 71afc1a

Update README.md

Files changed (1):
  1. README.md +106 -100
README.md CHANGED
@@ -1,109 +1,115 @@
 
  base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
  library_name: peft
- license: llama3.1
  model-index:
  - name: llama-8b-south-africa

- model_description:
- name: llama-8b-south-africa
- description: |
- This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) on the generator dataset.
- [Alapa Cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) translated into Xhose, Zulu, Tswana, Northern Sotho and Afrikaans using machine translation.

- details: |
- The model could only be evaluated in Xhosa and Zulu due to Iroko language availability. Its aim is to show cross-lingual transfer can be achieved at a low cost. Translation cost roughly $370 per language and training cost roughly $15 using an Akash Compute Network GPU.
-
- intended_use: This model is intended to be used for research.
-
- evaluation_results:
- - task:
- type: text-generation
- name: African Language Evaluation
- dataset:
- name: afrimgsm_direct_xho
- type: text-classification
- split: test
- metrics:
- - name: Accuracy
- type: accuracy
- value: 0.02
- - name: Dataset
- type: dataset
- value: MGS-Xho Direct
-
- - task:
- type: text-generation
- name: African Language Evaluation
- dataset:
- name: afrimmlu_direct_xho
- type: text-classification
- split: test
- metrics:
- - name: Accuracy
- type: accuracy
- value: 0.29
- - name: Dataset
- type: dataset
- value: MMLU-Xho Direct
-
- - task:
- type: text-generation
- name: African Language Evaluation
- dataset:
- name: afrixnli_en_direct_xho
- type: text-classification
- split: test
- metrics:
- - name: Accuracy
- type: accuracy
- value: 0.44
- - name: Dataset
- type: dataset
- value: XNLI-Xho Direct
-
- - task:
- type: text-generation
- name: African Language Evaluation
- dataset:
- name: afrimgsm_direct_zul
- type: text-classification
- split: test
- metrics:
- - name: Accuracy
- type: accuracy
- value: 0.045
- - name: Dataset
- type: dataset
- value: MGS-Zul Direct

- - task:
- type: text-generation
- name: African Language Evaluation
- dataset:
- name: afrimmlu_direct_zul
- type: text-classification
- split: test
- metrics:
- - name: Accuracy
- type: accuracy
- value: 0.29
- - name: Dataset
- type: dataset
- value: MMLU-Zul Direct

- - task:
- type: text-generation
- name: African Language Evaluation
- dataset:
- name: afrixnli_en_direct_zul
- type: text-classification
- split: test
- metrics:
- - name: Accuracy
- type: accuracy
- value: 0.43
- - name: Dataset
- type: dataset
- value: XNLI-Zul Direct

- terms_of_use: This model is governed by a Apache 2.0 License.

+ ---
  base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
+ datasets:
+ - generator
  library_name: peft
+ license: apache-2.0
+ tags:
+ - trl
+ - sft
+ - generated_from_trainer
+ - african-languages
  model-index:
  - name: llama-8b-south-africa
+ results:
+ - task:
+ type: text-generation
+ name: African Language Evaluation
+ dataset:
+ name: afrimgsm_direct_xho
+ type: text-classification
+ split: test
+ metrics:
+ - name: Accuracy
+ type: accuracy
+ value: 0.02
+ - task:
+ type: text-generation
+ name: African Language Evaluation
+ dataset:
+ name: afrimgsm_direct_zul
+ type: text-classification
+ split: test
+ metrics:
+ - name: Accuracy
+ type: accuracy
+ value: 0.045
+ - task:
+ type: text-generation
+ name: African Language Evaluation
+ dataset:
+ name: afrimmlu_direct_xho
+ type: text-classification
+ split: test
+ metrics:
+ - name: Accuracy
+ type: accuracy
+ value: 0.29
+ - task:
+ type: text-generation
+ name: African Language Evaluation
+ dataset:
+ name: afrimmlu_direct_zul
+ type: text-classification
+ split: test
+ metrics:
+ - name: Accuracy
+ type: accuracy
+ value: 0.29
+ - task:
+ type: text-generation
+ name: African Language Evaluation
+ dataset:
+ name: afrixnli_en_direct_xho
+ type: text-classification
+ split: test
+ metrics:
+ - name: Accuracy
+ type: accuracy
+ value: 0.44
+ - task:
+ type: text-generation
+ name: African Language Evaluation
+ dataset:
+ name: afrixnli_en_direct_zul
+ type: text-classification
+ split: test
+ metrics:
+ - name: Accuracy
+ type: accuracy
+ value: 0.43
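Editor's note: the accuracy figures added above are scores on IrokoBench-style tasks (AfriMGSM, AfriMMLU, AfriXNLI in Xhosa and Zulu). As a hedged sketch, numbers of this kind could be reproduced with EleutherAI's lm-evaluation-harness, assuming the afri* task names below are registered in the installed harness version; the adapter repo id and the few-shot setting are placeholders, not taken from this commit.

```python
# Sketch only: evaluate the adapter on the tasks named in the model-index.
# ASSUMPTIONS: the afri* task names exist in your installed lm_eval version,
# and "chad-brouze/llama-8b-south-africa" is a placeholder adapter repo id.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=meta-llama/Meta-Llama-3.1-8B-Instruct,"
        "peft=chad-brouze/llama-8b-south-africa"
    ),
    tasks=[
        "afrimgsm_direct_xho", "afrimgsm_direct_zul",
        "afrimmlu_direct_xho", "afrimmlu_direct_zul",
        "afrixnli_en_direct_xho", "afrixnli_en_direct_zul",
    ],
    num_fewshot=0,  # assumed; the card does not state the few-shot setting
    batch_size=8,
)

# Print the per-task metric dictionaries reported by the harness.
for task, metrics in results["results"].items():
    print(task, metrics)
```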
 
+ model_description: |
+ This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) on the generator dataset.
+ [Alpaca Cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) translated into Xhosa, Zulu, Tswana, Northern Sotho and Afrikaans using machine translation.

+ The model could only be evaluated in Xhosa and Zulu due to Iroko language availability. Its aim is to show that cross-lingual transfer can be achieved at low cost: translation cost roughly $370 per language and training cost roughly $15 using an Akash Compute Network GPU.
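Editor's note: since the card metadata marks this repository as a PEFT adapter on top of Meta-Llama-3.1-8B-Instruct, the adapter would normally be loaded onto the base model rather than used on its own. A minimal loading sketch follows; the adapter repo id and the Xhosa prompt are illustrative assumptions, not taken from the commit.

```python
# Minimal sketch: attach this PEFT adapter to the base model and generate.
# ASSUMPTION: "chad-brouze/llama-8b-south-africa" is a placeholder repo id --
# substitute the actual Hub path of this adapter.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
adapter_id = "chad-brouze/llama-8b-south-africa"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)

# Illustrative prompt (Xhosa: "Who is the president of South Africa?").
messages = [{"role": "user", "content": "Ngubani umongameli waseMzantsi Afrika?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```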
+ training_details:
+ loss: 1.0571
+ hyperparameters:
+ learning_rate: 0.0002
+ train_batch_size: 4
+ eval_batch_size: 8
+ seed: 42
+ distributed_type: multi-GPU
+ gradient_accumulation_steps: 2
+ total_train_batch_size: 8
+ optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ lr_scheduler_type: cosine
+ lr_scheduler_warmup_ratio: 0.1
+ num_epochs: 1
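Editor's note: the trl/sft tags and the hyperparameters above map onto a TRL supervised fine-tuning run. The sketch below reproduces those hyperparameters under stated assumptions: the dataset path, text formatting, and LoRA settings are placeholders, since the card only names a "generator" dataset built from the translated Alpaca data and lists no adapter rank.

```python
# Sketch of an SFT run matching the hyperparameters listed above (recent TRL API).
# ASSUMPTIONS: dataset path, text formatting, and LoRA rank/alpha are placeholders.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Stand-in for the machine-translated Alpaca data described in the card.
train_dataset = load_dataset("yahma/alpaca-cleaned", split="train")

def to_text(example):
    # Flatten Alpaca-style columns into a single "text" field for SFT.
    prompt = example["instruction"] + ("\n" + example["input"] if example["input"] else "")
    return {"text": prompt + "\n" + example["output"]}

train_dataset = train_dataset.map(to_text)

peft_config = LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32)  # assumed LoRA settings

args = SFTConfig(
    output_dir="llama-8b-south-africa",
    dataset_text_field="text",
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # 4 per device x 2 accumulation = total batch size 8
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
)

trainer = SFTTrainer(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    args=args,
    train_dataset=train_dataset,
    peft_config=peft_config,
)
trainer.train()
```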
 
+ training_results:
+ final_loss: 1.0959
+ epochs: 0.9999
+ steps: 5596
+ validation_loss: 1.0571

+ framework_versions:
+ peft: 0.12.0
+ transformers: 4.44.2
+ pytorch: 2.4.1+cu121
+ datasets: 3.0.0
+ tokenizers: 0.19.1
+ ---
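Editor's note: to mirror the environment pinned under framework_versions, the listed versions can be compared with what is installed locally. A small standard-library sketch (the package-to-version mapping is copied from the card; "pytorch" is checked under its installable name, torch):

```python
# Compare locally installed versions against those listed in framework_versions.
from importlib.metadata import version, PackageNotFoundError

card_versions = {
    "peft": "0.12.0",
    "transformers": "4.44.2",
    "torch": "2.4.1+cu121",   # listed as "pytorch" in the card
    "datasets": "3.0.0",
    "tokenizers": "0.19.1",
}

for package, listed in card_versions.items():
    try:
        installed = version(package)
    except PackageNotFoundError:
        installed = "not installed"
    status = "OK" if installed == listed else "differs"
    print(f"{package}: installed {installed} vs card {listed} ({status})")
```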