Update README.md
README.md
@@ -9,6 +9,42 @@ pipeline_tag: text-generation
 
 Model Card is still WIP!
 
+
+## Base model info
+
+Mamba is a new state space model architecture showing promising performance on information-dense data such as language modeling, where previous subquadratic models fall short of Transformers.
+It is based on the line of progress on [structured state space models](https://github.com/state-spaces/s4),
+with an efficient hardware-aware design and implementation in the spirit of [FlashAttention](https://github.com/Dao-AILab/flash-attention).
+
+## Dataset info
+
+_Look Ma, an instruction dataset that wasn't generated by GPTs!_
+
+### Dataset Description
+
+- **Repository:** https://github.com/huggingface/alignment-handbook
+- **Paper:**
+- **Leaderboard:** https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
+- **Point of Contact:** Lewis Tunstall
+
+#### Dataset Summary
+
+No Robots is a high-quality dataset of 10,000 instructions and demonstrations created by skilled human annotators. This data can be used for supervised fine-tuning (SFT) to make language models follow instructions better. No Robots was modelled after the instruction dataset described in OpenAI's [InstructGPT paper](https://huggingface.co/papers/2203.02155), and is comprised mostly of single-turn instructions across the following categories:
+
+| Category   |   Count |
+|:-----------|--------:|
+| Generation |    4560 |
+| Open QA    |    1240 |
+| Brainstorm |    1120 |
+| Chat       |     850 |
+| Rewrite    |     660 |
+| Summarize  |     420 |
+| Coding     |     350 |
+| Classify   |     350 |
+| Closed QA  |     260 |
+| Extract    |     190 |
+
+
 ## Usage
 
 ```py