clibrain/mamba-2.8b-chat-no_robots


mrm8488 committed on Dec 11, 2023

Commit 606fa26 · 1 Parent(s): ce7cf93

Update README.md

Files changed (1)

README.md +36 -0


@@ -9,6 +9,42 @@ pipeline_tag: text-generation
 Model Card is still WIP!
+## Base model info
+Mamba is a new state space model architecture showing promising performance on information-dense data such as language modeling, where previous subquadratic models fall short of Transformers.
+It is based on the line of progress on [structured state space models](https://github.com/state-spaces/s4),
+with an efficient hardware-aware design and implementation in the spirit of [FlashAttention](https://github.com/Dao-AILab/flash-attention).
+## Dataset info
+_Look Ma, an instruction dataset that wasn't generated by GPTs!_
+### Dataset Description
+- **Repository:** https://github.com/huggingface/alignment-handbook
+- **Paper:**
+- **Leaderboard:** https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
+- **Point of Contact:** Lewis Tunstall
+#### Dataset Summary
+No Robots is a high-quality dataset of 10,000 instructions and demonstrations created by skilled human annotators. This data can be used for supervised fine-tuning (SFT) to make language models follow instructions better. No Robots was modelled after the instruction dataset described in OpenAI's [InstructGPT paper](https://huggingface.co/papers/2203.02155), and consists mostly of single-turn instructions across the following categories:
+| Category   |   Count |
+|:-----------|--------:|
+| Generation |    4560 |
+| Open QA    |    1240 |
+| Brainstorm |    1120 |
+| Chat       |     850 |
+| Rewrite    |     660 |
+| Summarize  |     420 |
+| Coding     |     350 |
+| Classify   |     350 |
+| Closed QA  |     260 |
+| Extract    |     190 |
 ## Usage
 ```py