ymcki committed
Commit b72be0a · 1 parent: cc8a749

merge method;

Files changed (1): README.md (+111 −12)
---
base_model: google/gemma-2-2b-jpn-it
language:
- multilingual
datasets:
- mlabonne/orpo-dpo-mix-40k
library_name: transformers
license: gemma
license_link: https://ai.google.dev/gemma/terms
pipeline_tag: text-generation
tags:
- nlp
- code
quantized_by: ymcki
widget:
- messages:
  - role: user
    content: Can you provide ways to eat combinations of bananas and dragonfruits?
---
 
Original model: https://huggingface.co/google/gemma-2-2b-jpn-it

## Prompt format

```
<start_of_turn>user
{prompt}<end_of_turn>
<start_of_turn>model
<end_of_turn>
<start_of_turn>model

```
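If you are building the prompt without a tokenizer, the template above can be reproduced with plain string formatting. `build_prompt` is a hypothetical helper for illustration, not part of this repository:

```python
# Hypothetical helper: build a single-turn Gemma-style prompt by hand,
# matching the template above. No system turn is emitted, since the
# model does not support one.
def build_prompt(user_message: str) -> str:
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

print(build_prompt("Hello"))
```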

Note that this model does not support a system prompt.

Since [gemma-2-2b-jpn-it-abliterated-18](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-abliterated-18) is slightly brain-damaged compared to the original [gemma-2-2b-jpn-it](https://huggingface.co/google/gemma-2-2b-jpn-it), I decided to try ORPO fine-tuning to see whether the damage can be healed.

Using the [gemma-2-2b base model](https://huggingface.co/google/gemma-2-2b), I employed the ORPO method described by [mlabonne](https://towardsdatascience.com/fine-tune-llama-3-with-orpo-56cfab2f9ada), but the input model was loaded into VRAM by [unsloth](https://github.com/unslothai/unsloth), which allowed training on the full 40k dataset to run on a single 3090.

Five epochs were run. The smallest eval_loss was achieved at epoch 4.96. The checkpoint at epoch 4.96 was used to obtain a model adapter, which was applied to [gemma-2-2b-jpn-it-abliterated-18](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-abliterated-18) to obtain [gemma-2-2b-ORPO-jpn-it-abliterated-18](https://huggingface.co/ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18).

| Epoch | loss | eval_loss | eval_logps/rejected | eval_logps/chosen |
| ----- | ---- | --------- | ------------------- | ----------------- |
| 1.00 | 1.2868 | 1.0689 | -1.0857 | -0.7500 |
| 2.00 | 0.9663 | 1.0288 | -1.1321 | -0.7289 |
| 3.00 | 1.2255 | 1.0297 | -1.1840 | -0.7272 |
| 4.00 | 1.5293 | 1.0166 | -1.2004 | -0.7200 |
| 4.96 | 1.2893 | 1.0077 | -1.1754 | -0.7106 |
| 5.00 | 1.3458 | 1.0078 | -1.1730 | -0.7105 |
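To make the eval_logps columns concrete, here is a small sketch of the odds-ratio penalty that ORPO adds to the SFT loss, computed from average per-token log-probabilities like those in the table. This is an illustration of the formula, not the training code:

```python
import math

# ORPO's odds-ratio penalty: odds(y|x) = p / (1 - p) with
# p = exp(average log-probability), and the penalty is
# -log sigmoid(log odds(chosen) - log odds(rejected)).
def orpo_odds_ratio_loss(logp_chosen: float, logp_rejected: float) -> float:
    def log_odds(logp: float) -> float:
        p = math.exp(logp)
        return math.log(p / (1.0 - p))

    diff = log_odds(logp_chosen) - log_odds(logp_rejected)
    # -log sigmoid(diff): shrinks as the chosen answer is favoured more
    return -math.log(1.0 / (1.0 + math.exp(-diff)))

# Epoch 4.96 values from the table
print(round(orpo_odds_ratio_loss(-0.7106, -1.1754), 3))
```

With the epoch-4.96 values the penalty comes out to roughly 0.38; the widening gap between chosen and rejected log-probabilities over the epochs is what drives this term down.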

Then I followed Rombodawg's [suggestion](https://www.reddit.com/r/LocalLLaMA/comments/1fyx27y/im_pretty_happy_with_how_my_method_worked_out/) to merge [gemma-2-2b](https://huggingface.co/google/gemma-2-2b), [gemma-2-2b-ORPO-jpn-it-abliterated-18](https://huggingface.co/ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18) and [gemma-2-2b-jpn-it-abliterated-18](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-abliterated-18) to obtain this model.

This model is uploaded here to be evaluated by the Open LLM Leaderboard. Further ORPO fine-tuning is currently underway to see if the model can regain its sanity. You can play with this model now or wait until the fine-tuning is done.

## Benchmark (raw scores × 100 only)

Click on a model name to go to the raw-score JSON generated by the Open LLM Leaderboard.

| Model | Average | IFEval | BBH | Math Lv5 | GPQA | MUSR | MMLU-PRO |
| ----- | ------- | ------ | --- | -------- | ---- | ---- | -------- |
| [gemma-2-2b-jpn-it](https://huggingface.co/datasets/open-llm-leaderboard/results/blob/main/google/gemma-2-2b-jpn-it/results_2024-10-15T15-21-39.173019.json) | 30.82 | 54.11 | 41.43 | 0.0 | 27.52 | 37.17 | 24.67 |
| gemma-2-2b-ORPO-jpn-it-abliterated-18 (5 epochs) | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
| gemma-2-2b-ORPO-jpn-it-abliterated-18-merge (5 epochs) | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
| [gemma-2-2b-jpn-it-abliterated-17](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-17/results_2024-10-18T15-18-46.821674.json) | 30.29 | 52.65 | 40.46 | 0.0 | 27.18 | 36.90 | 24.55 |
| [gemma-2-2b-jpn-it-abliterated-18](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-18/results_2024-10-18T15-41-42.399571.json) | 30.61 | 53.02 | 40.96 | 0.0 | 27.35 | 37.30 | 25.05 |
| [gemma-2-2b-jpn-it-abliterated-24](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-24/results_2024-10-25T16-29-46.542899.json) | 30.61 | 51.37 | 40.77 | 0.0 | 27.77 | 39.02 | 24.73 |
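The Average column is consistent with a plain mean of the six benchmark scores; here is a quick sanity check (the averaging rule is my inference from the numbers, not something the leaderboard card states):

```python
# Assumed rule: the "Average" column is the mean of the six benchmark
# scores, rounded to two decimals.
def leaderboard_average(scores):
    return round(sum(scores) / len(scores), 2)

# gemma-2-2b-jpn-it row: IFEval, BBH, Math Lv5, GPQA, MUSR, MMLU-PRO
print(leaderboard_average([54.11, 41.43, 0.0, 27.52, 37.17, 24.67]))  # → 30.82
```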
 
## Merge Details
### Merge Method

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

This model was merged with the [TIES](https://arxiv.org/abs/2306.01708) merge method, using [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) as the base.
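As a rough illustration of the TIES idea (trim low-magnitude task-vector entries, elect a sign per parameter, then average only the agreeing deltas), here is a toy sketch on plain Python lists. This is not mergekit's implementation, and with density 1.0, as in this model's configuration, the trim step keeps everything:

```python
# Toy sketch of TIES merging on 1-D weights.
# base: base-model weights; deltas: task vectors (finetune - base);
# weights: per-model merge weights.
def ties_merge(base, deltas, weights):
    merged = []
    for i, b in enumerate(base):
        vals = [w * d[i] for d, w in zip(deltas, weights)]
        sign = 1.0 if sum(vals) >= 0 else -1.0    # elect the dominant sign
        keep = [v for v in vals if v * sign > 0]  # drop disagreeing deltas
        merged.append(b + (sum(keep) / len(keep) if keep else 0.0))
    return merged

print(ties_merge([0.0, 0.0], [[1.0, -1.0], [3.0, 1.0]], [1.0, 1.0]))
```

The sign election is what distinguishes TIES from plain averaging: deltas that pull a parameter in the minority direction are discarded instead of cancelling the majority out.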
 
### Models Merged

The following models were included in the merge:
* ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18
* ymcki/gemma-2-2b-jpn-it-abliterated-18

### Configuration
 
The following YAML configuration was used to produce this model:

```yaml
models:
  - model: ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18
    dtype: bfloat16
    parameters:
      density: 1.0
      weight: 1.0
  - model: ymcki/gemma-2-2b-jpn-it-abliterated-18
    dtype: bfloat16
    parameters:
      density: 1.0
      # … (lines elided in the diff view) …
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18
```

## How to run this model

```py
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18-merge"
dtype = torch.bfloat16

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    torch_dtype=dtype,
)

chat = [
    {"role": "user", "content": "Write a hello world program"},
]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# Generate a reply from the formatted prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

## Downloading using huggingface-cli

First, make sure you have huggingface-cli installed:

```
pip install -U "huggingface_hub[cli]"
```

Then you can target the files you want (here, the whole repository):

```
huggingface-cli download ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18-merge --include "*" --local-dir ./
```

## Credits

Thank you to mlabonne for describing the ORPO fine-tuning method.

Thank you to FullOf_Bad_Ideas from LocalLLaMA for suggesting unsloth to save VRAM.