Update README.md
README.md
CHANGED
@@ -24,7 +24,7 @@ base_model:
 
 <p>
 <b>Version 0.1 notes:</b><br> Dataset was deduped and cleaned from version 0.0, sequence length was also increased. Resulting model seems to be stabler, and 0.0 problems with handling short inputs and min_p sampling seem to be gone.<br>
-This version seems to be more or less optimal for the current data
+This version seems to be more or less optimal for the current data and available compute.
 </p>
 
 <p>Note: using quantized KV cache with Qwen2.5 <b>is not recommended</b> and can lead to degraded output quality. On the other hand, Qwen's KV cache is already light enough, so using f16 for it shouldn't be problematic.</p>