asiansoul committed on
Commit c39bd0c • 1 Parent(s): 0120356

Update README.md

Files changed (1):
1. README.md (+112 -5)
README.md CHANGED
@@ -1,5 +1,112 @@

Removed front matter:

- ---
- license: other
- license_name: other
- license_link: LICENSE
- ---

Updated README.md:
---
base_model:
- beomi/Llama-3-KoEn-8B-Instruct-preview
- Danielbrdz/Barcenas-Llama3-8b-ORPO
- maum-ai/Llama-3-MAAL-8B-Instruct-v0.1
- rombodawg/Llama-3-8B-Instruct-Coder
- NousResearch/Meta-Llama-3-8B-Instruct
- rombodawg/Llama-3-8B-Base-Coder-v3.5-10k
- cognitivecomputations/dolphin-2.9-llama3-8b
- asiansoul/Llama-3-Open-Ko-Linear-8B
- NousResearch/Meta-Llama-3-8B
- aaditya/Llama3-OpenBioLLM-8B
library_name: transformers
tags:
- mergekit
- merge

---
# Joah-Llama-3-KoEn-8B-Coder-v1

A merge model for all of you, one that from today on will be a light for one another.

"Joah" (Korean 좋아, "I like it") by AsianSoul
## Merge Details

In my opinion, the performance of this merge model does not seem bad at all.

This may not be a model that fully satisfies you, but if we keep overcoming its shortcomings, won't we someday find the answer we are looking for?
### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with [NousResearch/Meta-Llama-3-8B](https://huggingface.co/NousResearch/Meta-Llama-3-8B) as the base model.
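For intuition, here is a minimal NumPy sketch of the DARE-TIES idea on flat parameter vectors: each fine-tuned model's delta from the base is randomly sparsified and rescaled (DARE), and the surviving weighted deltas are combined only where their signs agree with the majority (TIES). This is a toy illustration under those assumptions, not mergekit's actual implementation; the function name `dare_ties_merge` and the random vectors are invented for the example.

```python
import numpy as np

def dare_ties_merge(base, finetuned, densities, weights, seed=0):
    """Toy DARE-TIES merge over flat parameter vectors (illustration only)."""
    rng = np.random.default_rng(seed)
    deltas = []
    for ft, density, weight in zip(finetuned, densities, weights):
        delta = ft - base                              # task vector relative to the base
        keep = rng.random(delta.shape) < density       # DARE: randomly keep ~density of the entries
        delta = np.where(keep, delta, 0.0) / density   # rescale survivors by 1/density
        deltas.append(weight * delta)
    deltas = np.stack(deltas)
    # TIES-style sign election: keep only contributions agreeing with the majority sign
    elected_sign = np.sign(deltas.sum(axis=0))
    agree = np.sign(deltas) == elected_sign
    merged_delta = np.where(agree, deltas, 0.0).sum(axis=0)
    return base + merged_delta

# Tiny demo with random vectors standing in for model weights
base = np.zeros(8)
finetuned = [base + np.random.default_rng(i + 1).normal(size=8) for i in range(3)]
print(dare_ties_merge(base, finetuned, densities=[0.60, 0.55, 0.55], weights=[0.25, 0.15, 0.2]))
```

In the real merge, mergekit applies this kind of update per tensor across all ten models, using the densities and weights listed in the configuration below.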
### Models Merged

The following models were included in the merge:
* [beomi/Llama-3-KoEn-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-KoEn-8B-Instruct-preview)
* [Danielbrdz/Barcenas-Llama3-8b-ORPO](https://huggingface.co/Danielbrdz/Barcenas-Llama3-8b-ORPO)
* [maum-ai/Llama-3-MAAL-8B-Instruct-v0.1](https://huggingface.co/maum-ai/Llama-3-MAAL-8B-Instruct-v0.1)
* [rombodawg/Llama-3-8B-Instruct-Coder](https://huggingface.co/rombodawg/Llama-3-8B-Instruct-Coder)
* [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
* [rombodawg/Llama-3-8B-Base-Coder-v3.5-10k](https://huggingface.co/rombodawg/Llama-3-8B-Base-Coder-v3.5-10k)
* [cognitivecomputations/dolphin-2.9-llama3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b)
* [asiansoul/Llama-3-Open-Ko-Linear-8B](https://huggingface.co/asiansoul/Llama-3-Open-Ko-Linear-8B)
* [aaditya/Llama3-OpenBioLLM-8B](https://huggingface.co/aaditya/Llama3-OpenBioLLM-8B)
### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: NousResearch/Meta-Llama-3-8B
    # Base model providing a general foundation without specific parameters

  - model: NousResearch/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.60
      weight: 0.25

  - model: beomi/Llama-3-KoEn-8B-Instruct-preview
    parameters:
      density: 0.55
      weight: 0.15

  - model: asiansoul/Llama-3-Open-Ko-Linear-8B
    parameters:
      density: 0.55
      weight: 0.2

  - model: maum-ai/Llama-3-MAAL-8B-Instruct-v0.1
    parameters:
      density: 0.55
      weight: 0.1

  - model: rombodawg/Llama-3-8B-Instruct-Coder
    parameters:
      density: 0.55
      weight: 0.1

  - model: rombodawg/Llama-3-8B-Base-Coder-v3.5-10k
    parameters:
      density: 0.55
      weight: 0.1

  - model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.55
      weight: 0.05

  - model: Danielbrdz/Barcenas-Llama3-8b-ORPO
    parameters:
      density: 0.55
      weight: 0.05

  - model: aaditya/Llama3-OpenBioLLM-8B
    parameters:
      density: 0.55
      weight: 0.1

merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
parameters:
  int8_mask: true
dtype: bfloat16
```
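A configuration like this can typically be re-run locally with mergekit's `mergekit-yaml` entry point. Below is a minimal loading sketch, assuming the merged weights are published under the repo id `asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1` (adjust if different) and that the tokenizer carries a Llama 3 chat template; the repo id, prompt, and generation settings are illustrative assumptions, not part of the merge itself.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1"  # assumed repo id; adjust if needed

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used for the merge
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful bilingual (Korean/English) coding assistant."},
    {"role": "user", "content": "Write a quicksort function in Python and explain it briefly in Korean."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```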