---
license: apache-2.0
library_name: transformers
tags:
- mergekit
- merge
base_model:
- mistralai/Mistral-7B-Instruct-v0.2
model-index:
- name: bigstral-12b-32k
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 41.94
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=abacusai/bigstral-12b-32k
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 25.56
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=abacusai/bigstral-12b-32k
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 0.98
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=abacusai/bigstral-12b-32k
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 5.7
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=abacusai/bigstral-12b-32k
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 15.86
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=abacusai/bigstral-12b-32k
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 18.24
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=abacusai/bigstral-12b-32k
      name: Open LLM Leaderboard
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/bigstral-12b-32k-GGUF

This is a quantized version of [abacusai/bigstral-12b-32k](https://huggingface.co/abacusai/bigstral-12b-32k), created using llama.cpp.

# Original Model Card

# bigstral-12b-32k

`ollama run ehartford/bigstral`

I reached for SOLAR but realized that its context window is only 4k, so I decided to make my own 32k version.

This is [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) with layers interleaved to create a larger 12b model.

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Prompt Format

```
<s>[INST] What is your favourite condiment? [/INST]
Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s>
[INST] Do you have dangerously spicy mayonnaise recipes? [/INST]
```

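The template above can be assembled programmatically. Here is a minimal sketch using plain string formatting (no tokenizer dependency; the helper name is my own, not part of any library):

```python
def build_mistral_prompt(turns):
    """Build a Mistral-instruct prompt from (user, assistant) turn pairs.

    `turns` is a list of (user_message, assistant_reply) tuples; pass
    None as the reply for the final, unanswered user turn.
    """
    prompt = "<s>"
    for user, assistant in turns:
        prompt += f"[INST] {user} [/INST]"
        if assistant is not None:
            # Close each completed assistant turn with the EOS token.
            prompt += f"\n{assistant}</s>\n"
    return prompt

prompt = build_mistral_prompt([
    ("What is your favourite condiment?",
     "Well, I'm quite partial to a good squeeze of fresh lemon juice."),
    ("Do you have dangerously spicy mayonnaise recipes?", None),
])
print(prompt)
```

Note that `<s>` appears once at the start of the conversation, while each completed assistant reply is terminated with `</s>`, matching the template shown above.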
## Example Output

give me some code that is very simple in javascript but would be very difficult in ANSI-C

> In JavaScript, functions are first-class values, which means you can assign them to variables, pass them as arguments to other functions, and even nest them inside other functions. This leads to concise and expressive code.
>
> Here's an example of a simple recursive function that calculates the factorial of a given number in JavaScript:
>
> ```javascript
> const factorial = n => n === 0 ? 1 : n * factorial(n - 1);
>
> console.log(factorial(5)); // Output: 120
> ```
>
> Now, let's try to write an equivalent code snippet in ANSI-C with similar functionality:
>
> ```c
> #include <stdio.h>
> #define MAX_RECURSION_DEPTH 1000
> int factorial(int n) {
>     if (n == 0)
>         return 1;
>     else {
>         int result;
>         result = factorial(--n);
>         if (++depth > MAX_RECURSION_DEPTH)
>             while(1){} // Recursion stack overflow
>         return n * result;
>     }
> }
> int main() {
>     int i, result;
>     int n = 5;
>     int depth = 0;
>
>     result = factorial(n);
>     printf("Factorial of %d is: %d\n", n, result);
>     return 0;
> }
> ```
>
> As you can see, the ANSI-C code becomes more verbose and has several disadvantages compared to JavaScript:
>
> 1. Need to define a maximum recursion depth using a constant or an array index, which can lead to hardcoded limits and potential stack overflow issues.
> 2. The use of preprocessor directives (#define) for defining functions and macros can make the code less readable and harder to understand.
> 3. The lack of closures in ANSI-C requires using static variables or passing arguments as arguments to nested functions, making the code more verbose and less concise.

## Merge Details

### Merge Method

This model was merged using the passthrough merge method.

### Models Merged

The following models were included in the merge:

* [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
dtype: float16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 8]
    model: mistralai/Mistral-7B-Instruct-v0.2
- sources:
  - layer_range: [4, 12]
    model: mistralai/Mistral-7B-Instruct-v0.2
- sources:
  - layer_range: [8, 16]
    model: mistralai/Mistral-7B-Instruct-v0.2
- sources:
  - layer_range: [12, 20]
    model: mistralai/Mistral-7B-Instruct-v0.2
- sources:
  - layer_range: [16, 24]
    model: mistralai/Mistral-7B-Instruct-v0.2
- sources:
  - layer_range: [20, 28]
    model: mistralai/Mistral-7B-Instruct-v0.2
- sources:
  - layer_range: [24, 32]
    model: mistralai/Mistral-7B-Instruct-v0.2
```
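The seven 8-layer slices in the config overlap by 4 layers each, and passthrough stacks them rather than averaging, so overlapping layers are duplicated. A quick sketch of the resulting depth (the parameter estimate is a rough back-of-the-envelope assumption, since transformer blocks dominate the count):

```python
# Slices copied from the mergekit config above: [start, end) layer ranges.
slices = [(0, 8), (4, 12), (8, 16), (12, 20), (16, 24), (20, 28), (24, 32)]

# Passthrough stacks slices in order; overlapping ranges duplicate layers.
total_layers = sum(end - start for start, end in slices)
print(total_layers)  # 7 slices x 8 layers = 56, vs. 32 layers in the base model

# Rough scaling: if blocks dominate the 7B base's parameter count,
# the merged model lands at roughly 7 * 56/32 ≈ 12B parameters.
print(7 * total_layers / 32)
```

This is where the "12b" in the model name comes from: 56 duplicated layers against the base model's 32.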
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_abacusai__bigstral-12b-32k)

| Metric              | Value |
|---------------------|------:|
| Avg.                | 18.05 |
| IFEval (0-Shot)     | 41.94 |
| BBH (3-Shot)        | 25.56 |
| MATH Lvl 5 (4-Shot) |  0.98 |
| GPQA (0-shot)       |  5.70 |
| MuSR (0-shot)       | 15.86 |
| MMLU-PRO (5-shot)   | 18.24 |
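The "Avg." row is simply the arithmetic mean of the six benchmark scores; a quick check:

```python
# Benchmark scores from the leaderboard table above.
scores = {
    "IFEval (0-Shot)": 41.94,
    "BBH (3-Shot)": 25.56,
    "MATH Lvl 5 (4-Shot)": 0.98,
    "GPQA (0-shot)": 5.70,
    "MuSR (0-shot)": 15.86,
    "MMLU-PRO (5-shot)": 18.24,
}

avg = sum(scores.values()) / len(scores)
print(round(avg, 2))  # 18.05, matching the table's "Avg." row
```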