TheBloke committed
Commit a7ae99a · 1 Parent(s): f6b47bd
Upload README.md

Files changed (1):
  1. README.md +25 -8

README.md CHANGED
@@ -10,6 +10,18 @@ model_creator: AIDC-ai-business
model_name: Marcoroni 13B
model_type: llama
pipeline_tag: text-generation
+ prompt_template: 'Below is an instruction that describes a task. Write a response
+ that appropriately completes the request.
+
+
+ ### Instruction:
+
+ {prompt}
+
+
+ ### Response:
+
+ '
quantized_by: TheBloke
---

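The new `prompt_template` front-matter field is machine-readable. Below is a minimal sketch of consuming it, assuming `huggingface_hub` is installed; the repo id is this model's GGUF repo, and everything else is illustrative.

```python
# Sketch: pull the prompt_template field out of the model card's YAML
# front matter and substitute a user prompt. Assumes huggingface_hub
# is installed; error handling is omitted for brevity.
from huggingface_hub import ModelCard

card = ModelCard.load("TheBloke/Marcoroni-13B-GGUF")
# Fall back to a bare passthrough if the field is absent.
template = card.data.to_dict().get("prompt_template", "{prompt}")

# {prompt} is the only placeholder in the template, so str.format suffices.
print(template.format(prompt="Tell me about AI"))
```
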
@@ -61,17 +73,23 @@ Here is an incomplete list of clients and libraries that are known to support GG
<!-- repositories-available start -->
## Repositories available

+ * [AWQ model(s) for GPU inference.](https://huggingface.co/TheBloke/Marcoroni-13B-AWQ)
* [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/Marcoroni-13B-GPTQ)
* [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference](https://huggingface.co/TheBloke/Marcoroni-13B-GGUF)
* [AIDC-ai-business's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/AIDC-ai-business/Marcoroni-13B)
<!-- repositories-available end -->

<!-- prompt-template start -->
- ## Prompt template: Unknown
+ ## Prompt template: Alpaca

```
+ Below is an instruction that describes a task. Write a response that appropriately completes the request.
+
+ ### Instruction:
{prompt}

+ ### Response:
+
```

<!-- prompt-template end -->
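Since the repository list above includes the GGUF repo, here is a hedged sketch of fetching one of the quantised files programmatically with `huggingface_hub`. The filename matches the `./main` example further down; `hf_transfer` is an optional extra for faster downloads.

```python
# Sketch: download one GGUF file from the repo listed above.
# Assumes huggingface_hub is installed; hf_transfer is optional.
import os

# Optional speed-up; only effective if the hf_transfer package is installed.
# Must be set before huggingface_hub is imported.
os.environ["HUGGINGFACE_HUB_ENABLE_HF_TRANSFER"] = "1"

from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="TheBloke/Marcoroni-13B-GGUF",
    filename="marcoroni-13b.q4_K_M.gguf",
)
print(path)  # local path of the cached model file
```
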
@@ -193,7 +211,7 @@ Windows CLI users: Use `set HUGGINGFACE_HUB_ENABLE_HF_TRANSFER=1` before running
Make sure you are using `llama.cpp` from commit [d0cee0d36d5be95a0d9088b674dbb27354107221](https://github.com/ggerganov/llama.cpp/commit/d0cee0d36d5be95a0d9088b674dbb27354107221) or later.

```shell
- ./main -ngl 32 -m marcoroni-13b.q4_K_M.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "{prompt}"
+ ./main -ngl 32 -m marcoroni-13b.q4_K_M.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{prompt}\n\n### Response:"
```

Change `-ngl 32` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
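A rough Python equivalent of the amended `./main` invocation, assuming `llama-cpp-python` is installed: parameter names mirror the CLI flags, and the prompt follows the Alpaca template introduced above. (`-n -1` means unbounded generation in the CLI; a fixed cap stands in for it here.)

```python
# Sketch: run the GGUF model with llama-cpp-python, mirroring the CLI flags
# above (-c 4096 -> n_ctx, -ngl 32 -> n_gpu_layers, --temp, --repeat_penalty).
from llama_cpp import Llama

llm = Llama(
    model_path="marcoroni-13b.q4_K_M.gguf",
    n_ctx=4096,       # -c 4096
    n_gpu_layers=32,  # -ngl 32; use 0 if you have no GPU acceleration
)

prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nTell me about AI\n\n### Response:"
)

out = llm(prompt, max_tokens=512, temperature=0.7, repeat_penalty=1.1)
print(out["choices"][0]["text"])
```
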
@@ -306,15 +324,14 @@ Fine-tuned from Llama2-13B, we use Orca-style data and other open source data f
### Response:
```

-
# Evaluation Results ([Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard))

| Metric | Value |
|-----------------------|-------|
- | Avg. | 65.23 |
- | ARC (25-shot) | 63.31 |
- | HellaSwag (10-shot) | 83.04 |
- | MMLU (5-shot) | 58.78 |
- | TruthfulQA (0-shot) | 55.79 |
+ | Avg. | 65.76 |
+ | ARC (25-shot) | 62.46 |
+ | HellaSwag (10-shot) | 83.27 |
+ | MMLU (5-shot) | 59.63 |
+ | TruthfulQA (0-shot) | 57.7 |

<!-- original-model-card end -->
 