docs: add info about context length and link to GGUF
README.md
@@ -24,6 +24,8 @@ The speed may also be a bit faster, especially if you use frameworks optimized f
 - **Training Duration:** 1.5 epochs
 - **Hardware Used:** 8x AMD Instinct™ MI300X Accelerators
 
+Context length is reduced to 32k; I'm not sure how the sliding-window implementation should be translated (as far as I know, LLaMA doesn't use it).
+
 ## Prompting
 
 The model uses ChatML formatting for instructions. A typical input would look like this:
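Since LLaMA-architecture configs carry no sliding-window field, the 32k limit would surface as `max_position_embeddings`. A minimal sketch of checking this with transformers; the repo id `leafspark/magnum-72b-v1-llamaify` is an assumption inferred from the GGUF link added below:

```python
from transformers import AutoConfig

# Sketch: inspect the converted model's config for the 32k context limit.
# The repo id is an assumption inferred from this model card.
config = AutoConfig.from_pretrained("leafspark/magnum-72b-v1-llamaify")

print(config.max_position_embeddings)           # expected: 32768 (32k context)
# LLaMA configs have no sliding_window entry, so the original model's
# sliding-window attention has no direct equivalent after conversion.
print(getattr(config, "sliding_window", None))  # expected: None
```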
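The card's full prompt example sits outside this hunk; a minimal sketch of ChatML formatting, with an illustrative system prompt and user turn rather than the card's own example:

```python
# Minimal ChatML sketch; the system prompt and user message are illustrative,
# not the model card's own example (which lies outside this diff hunk).
prompt = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Hello, who are you?<|im_end|>\n"
    "<|im_start|>assistant\n"
)
```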
@@ -50,6 +52,8 @@ This version of the model has been converted to the LLaMA format to enhance comp
 
 Can be used in transformers or any software that supports LLaMA arch models.
 
+You can download GGUF quantizations here: [leafspark/magnum-72b-v1-llamaify-GGUF](https://huggingface.co/leafspark/magnum-72b-v1-llamaify-GGUF)
+
 ## Limitations
 
 Users should be aware that while this converted model maintains the general capabilities of the original, there might be subtle differences in performance or behavior due to the format change. It's recommended to test the model for your specific use case.
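The second hunk notes the model works in transformers or any LLaMA-compatible stack. A minimal loading sketch, assuming the repo id `leafspark/magnum-72b-v1-llamaify`; the dtype and device settings are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch: load the LLaMA-format conversion with transformers.
# Repo id, dtype, and device map are assumptions, not taken from the card.
model_id = "leafspark/magnum-72b-v1-llamaify"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 72B weights; expect multi-GPU or offloading
    device_map="auto",
)

prompt = "<|im_start|>user\nHello!<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```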
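The linked GGUF quantizations can be fetched with `huggingface_hub`; the quant filename below is hypothetical, so list the repo's files first to pick a real one:

```python
from huggingface_hub import hf_hub_download, list_repo_files

repo_id = "leafspark/magnum-72b-v1-llamaify-GGUF"

# List available quantizations first; the filename below is hypothetical.
print(list_repo_files(repo_id))

path = hf_hub_download(
    repo_id=repo_id,
    filename="magnum-72b-v1-llamaify.Q4_K_M.gguf",  # hypothetical quant name
)
print(path)  # local path to the downloaded GGUF file
```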
|