---
base_model: stabilityai/stablelm-zephyr-3b
datasets:
- HuggingFaceH4/ultrachat_200k
- HuggingFaceH4/ultrafeedback_binarized
- meta-math/MetaMathQA
- WizardLM/WizardLM_evol_instruct_V2_196k
- Intel/orca_dpo_pairs
license: other
license_link: https://huggingface.co/stabilityai/stablelm-zephyr-3b/blob/main/LICENSE
language:
- en
model_creator: stabilityai
model_name: stablelm-zephyr-3b
model_type: stablelm_epoch
inference: false
tags:
- causal-lm
- stablelm_epoch
pipeline_tag: text-generation
prompt_template: |
  <|system|>
  {{system_message}}<|endoftext|>
  <|user|>
  {{prompt}}<|endoftext|>
  <|assistant|>
quantized_by: brittlewis12
---

# StableLM Zephyr 3B GGUF

Original model: [StableLM Zephyr 3B](https://huggingface.co/stabilityai/stablelm-zephyr-3b)
Model creator: [Stability AI](https://huggingface.co/stabilityai)

This repo contains GGUF format model files for Stability AI’s StableLM Zephyr 3B.

> StableLM Zephyr 3B is a 3 billion parameter instruction-tuned model. Inspired by [HuggingFaceH4's Zephyr 7B](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) training pipeline, it was trained on a mix of publicly available and synthetic datasets using [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290), and evaluated with [MT Bench](https://tatsu-lab.github.io/alpaca_eval/) and the [Alpaca Benchmark](https://tatsu-lab.github.io/alpaca_eval/).

### What is GGUF?

GGUF is a file format for representing AI models. It is the third version of the format, introduced by the llama.cpp team on August 21st 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.

Converted using llama.cpp b1960 ([26d6076](https://github.com/ggerganov/llama.cpp/commits/26d607608d794efa56df3bdb6043a2f94c1d632c))

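Outside of a GUI app, one quick way to sanity-check a download from this repo is [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), which wraps llama.cpp's GGUF loader. The sketch below assumes `pip install llama-cpp-python` and a quant file named `stablelm-zephyr-3b.Q4_K_M.gguf`; that filename is illustrative, so substitute whichever quantization you actually downloaded.

```python
# Minimal sketch: verify a GGUF file from this repo loads with llama-cpp-python.
# The filename below is an assumption, not a guarantee of what this repo contains.
from llama_cpp import Llama

llm = Llama(
    model_path="stablelm-zephyr-3b.Q4_K_M.gguf",  # hypothetical filename
    n_ctx=4096,       # context length to allocate
    n_gpu_layers=-1,  # offload all layers to Metal/CUDA when available
    verbose=True,     # llama.cpp prints the GGUF metadata while loading
)

print("context window:", llm.n_ctx())
print("vocab size:", llm.n_vocab())
```
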
### Prompt template: Zephyr

```
<|system|>
{{system_message}}<|endoftext|>
<|user|>
{{prompt}}<|endoftext|>
<|assistant|>
```

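For completion-style APIs that take a raw string (llama.cpp's CLI, llama-cpp-python's completion call, etc.), the template above can be rendered with ordinary string formatting. The helper below is an illustrative sketch, not part of this repo; pass `<|endoftext|>` as a stop sequence so generation ends at the close of the assistant turn.

```python
# Illustrative helper (not part of this repo) that renders the Zephyr
# prompt template shown above from a system message and a user prompt.
def build_zephyr_prompt(prompt: str, system_message: str = "You are a helpful assistant.") -> str:
    return (
        f"<|system|>\n{system_message}<|endoftext|>\n"
        f"<|user|>\n{prompt}<|endoftext|>\n"
        "<|assistant|>\n"
    )

# Example: print the fully rendered prompt string.
print(build_zephyr_prompt("Summarize what the GGUF format is in one sentence."))
```

The rendered string can then be passed to the model loaded in the earlier sketch, e.g. `llm(build_zephyr_prompt("Hello!"), max_tokens=256, stop=["<|endoftext|>"])`.
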
---

## Download & run with [cnvrs](https://twitter.com/cnvrsai) on iPhone, iPad, and Mac!

![cnvrs.ai](https://pbs.twimg.com/profile_images/1744049151241797632/0mIP-P9e_400x400.jpg)

[cnvrs](https://testflight.apple.com/join/sFWReS7K) is the best app for private, local AI on your device:
- create & save **Characters** with custom system prompts & temperature settings
- download and experiment with any **GGUF model** you can [find on HuggingFace](https://huggingface.co/models?library=gguf)!
- make it your own with custom **Theme colors**
- powered by Metal ⚡️ & [Llama.cpp](https://github.com/ggerganov/llama.cpp), with **haptics** during response streaming!
- **try it out** yourself today, on [Testflight](https://testflight.apple.com/join/sFWReS7K)!
- follow [cnvrs on twitter](https://twitter.com/cnvrsai) to stay up to date

---

## Original Model Evaluations:

![mt-bench](https://cdn-uploads.huggingface.co/production/uploads/6310474ca119d49bc1eb0d80/8WIZS6dAlu5kSH-382pMl.png)

| Model | Size | Alignment | MT-Bench (score) | AlpacaEval (win rate %) |
|----------------------------|-----|------|------|-------|
| **StableLM Zephyr 3B** 🪁  | 3B  | DPO  | 6.64 | 76.00 |
| StableLM Zephyr (SFT only) | 3B  | SFT  | 6.04 | 71.15 |
| Capybara v1.9              | 3B  | dSFT | 5.94 | -     |
| MPT-Chat                   | 7B  | dSFT | 5.42 | -     |
| Xwin-LM v0.1               | 7B  | dPPO | 6.19 | 87.83 |
| Mistral-Instruct v0.1      | 7B  | -    | 6.84 | -     |
| Zephyr-7b-α                | 7B  | dDPO | 6.88 | -     |
| Zephyr-7b-β                | 7B  | dDPO | 7.34 | 90.60 |
| Falcon-Instruct            | 40B | dSFT | 5.17 | 45.71 |
| Guanaco                    | 65B | SFT  | 6.41 | 71.80 |
| Llama2-Chat                | 70B | RLHF | 6.86 | 92.66 |
| Vicuna v1.3                | 33B | dSFT | 7.12 | 88.99 |
| WizardLM v1.0              | 70B | dSFT | 7.71 | -     |
| Xwin-LM v0.1               | 70B | dPPO | -    | 95.57 |
| GPT-3.5-turbo              | -   | RLHF | 7.94 | 89.37 |
| Claude 2                   | -   | RLHF | 8.06 | 91.36 |
| GPT-4                      | -   | RLHF | 8.99 | 95.28 |

Additional benchmark results:

| Benchmark | Score |
|-----------------------|-------|
| ARC (25-shot)         | 47.0  |
| HellaSwag (10-shot)   | 74.2  |
| MMLU (5-shot)         | 46.3  |
| TruthfulQA (0-shot)   | 46.5  |
| Winogrande (5-shot)   | 65.5  |
| GSM8K (5-shot)        | 42.3  |
| BigBench (Avg)        | 35.26 |
| AGI Benchmark (Avg)   | 33.23 |