AuriAetherwiing
commited on
Commit
•
39c9cb4
1
Parent(s):
5496a82
Update README.md
Browse files
README.md
CHANGED
@@ -27,7 +27,7 @@ base_model:
|
|
27 |
This version seems to be more or less optimal for the current data. It (again) started crashing on each checkpoint after some point, but it was less of a problem this time, as eval/loss already flatlined by that time. This is epoch 2.7 checkpoint.
|
28 |
</p>
|
29 |
|
30 |
-
<p>Note: using quantized KV cache with Qwen2.5 <b>is not recommended</b> and can lead to degraded output quality. On the other hand, Qwen's KV cache is already light enough
|
31 |
|
32 |
<p>
|
33 |
<p>Prompt format is ChatML.</p><br>
|
|
|
27 |
This version seems to be more or less optimal for the current data. It (again) started crashing on each checkpoint after some point, but it was less of a problem this time, as eval/loss already flatlined by that time. This is epoch 2.7 checkpoint.
|
28 |
</p>
|
29 |
|
30 |
+
<p>Note: using quantized KV cache with Qwen2.5 <b>is not recommended</b> and can lead to degraded output quality. On the other hand, Qwen's KV cache is already light enough, so using f16 for it shouldn't be problematic.</p>
|
31 |
|
32 |
<p>
|
33 |
<p>Prompt format is ChatML.</p><br>
|