Update README.md
README.md
@@ -70,7 +70,7 @@ The format for TinyCoT was:
 
 Memphis outperforms human-data models that are over twice its size, along with SFT models of its size, and trades with the Zephyr DPO model. That said, Zephyr uses synthetic data, and *much* more of it.
 
-Note that BBH results have wide SEs, exceeding 16%.
+Note that BBH results have wide SEs, sometimes even exceeding 16%.
 
 
 It is unclear why Zephyr performs so poorly on BBH. Perhaps it is overfit, or maybe there was an issue with vllm.