Update README.md
Browse files
README.md
CHANGED
@@ -65,7 +65,7 @@ The format for TinyCoT was:
|
|
65 |
| [MPT 7B Instruct](mosaicml/mpt-7b-instruct) | **7B** | **Human**+Anthropic | SFT | 2.05% | 24.12% |
|
66 |
| [OpenLLaMA 7B v2 open-instruct](http://hf.co/VMware/open-llama-7b-v2-open-instruct) | **7B** | **Human** (nearly: ecqa is an exception) | SFT | 8.64% | 23.21% | 16.21% |
|
67 |
| [StableLM Zephyr 3B](https://hf.co/stabilityai/stablelm-zephyr-3b) | 3B | GPT | DPO | contaminated (45.72%) | **33.31%** | 0.91% |
|
68 |
-
| [**Memphis-CoT 3B**](https://hf.co/euclaise/memphis-cot-3b) | 3B | **Human** | Self-teaching | **13.8%** | *26.24%* | 38.24
|
69 |
*5-shot, as performed automatically by LM Evaluation Harness bbh_cot_fewshot even with num_fewshot=0
|
70 |
|
71 |
Memphis outperforms human-data models that are over twice its size, along with SFT models of its size, and trades with the Zephyr DPO model. That said, Zephyr uses synthetic data, and *much* more of it.
|
|
|
65 |
| [MPT 7B Instruct](mosaicml/mpt-7b-instruct) | **7B** | **Human**+Anthropic | SFT | 2.05% | 24.12% |
|
66 |
| [OpenLLaMA 7B v2 open-instruct](http://hf.co/VMware/open-llama-7b-v2-open-instruct) | **7B** | **Human** (nearly: ecqa is an exception) | SFT | 8.64% | 23.21% | 16.21% |
|
67 |
| [StableLM Zephyr 3B](https://hf.co/stabilityai/stablelm-zephyr-3b) | 3B | GPT | DPO | contaminated (45.72%) | **33.31%** | 0.91% |
|
68 |
+
| [**Memphis-CoT 3B**](https://hf.co/euclaise/memphis-cot-3b) | 3B | **Human** | Self-teaching | **13.8%** | *26.24%* | **38.24%** |
|
69 |
*5-shot, as performed automatically by LM Evaluation Harness bbh_cot_fewshot even with num_fewshot=0
|
70 |
|
71 |
Memphis outperforms human-data models that are over twice its size, along with SFT models of its size, and trades with the Zephyr DPO model. That said, Zephyr uses synthetic data, and *much* more of it.
|