euclaise/Memphis-CoT-3B

@@ -16,7 +16,7 @@ metrics:
 Memphis-CoT is a finetune of [StableLM 3b 4e1t](stabilityai/stablelm-3b-4e1t) on [TinyCoT](https://huggingface.co/datasets/euclaise/TinyCoT), along with [reddit-instruct](https://huggingface.co/datasets/euclaise/reddit-instruct) (subset to 5000 examples, excluding posts with brackets in the title) and a [curated](https://huggingface.co/datasets/sablo/oasst2_curated) subset of [oasst2](https://huggingface.co/datasets/OpenAssistant/oasst2).
-**Memphis was trained *only* on human data and OpenAssistant completions! No GPT generations here.**
+**Memphis was trained *only* on human data! No GPT generations here.**
 Finetuning was performed using my [supertrainer2000](https://github.com/euclaise/supertrainer2000) framework, using my Adalite optimizer.
@@ -65,7 +65,7 @@ The format for TinyCoT was:
 | [MPT 7B Instruct](https://hf.co/mosaicml/mpt-7b-instruct)              | **7B** | **Human**+Anthropic | SFT           |    2.05%       | 24.12%                                  | 11.01%                          |
 | [OpenLLaMA 7B v2 open-instruct](http://hf.co/VMware/open-llama-7b-v2-open-instruct) | **7B** | **Human** (nearly: ecqa is an exception) | SFT | 8.64% | 23.21%                   | 29.84%                          |
 | [StableLM Zephyr 3B](https://hf.co/stabilityai/stablelm-zephyr-3b)     | 3B     | GPT                 | DPO           |    possibly contaminated (45.72%)  | **33.31%**                   | 0.91%                           |
-| [**Memphis-CoT 3B**](https://hf.co/euclaise/memphis-cot-3b)            | 3B     | **Human**+OASST           | Self-teaching |    **13.8%**       | *26.24%*                            | **38.24%**                      |
+| [**Memphis-CoT 3B**](https://hf.co/euclaise/memphis-cot-3b)            | 3B     | **Human**           | Self-teaching |    **13.8%**       | *26.24%*                            | **38.24%**                      |
 *5-shot, as performed automatically by LM Evaluation Harness bbh_cot_fewshot even with num_fewshot=0
 Memphis outperforms other primarily-human-data models that are over twice its size, along with SFT models of its size, and trades with the Zephyr DPO model. That said, Zephyr uses synthetic data, and *much* more of it.