# OpenChat-3.5-0106_32K-PoSE

## Description

This model is [Openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106) with the context length extended from 8192 tokens to 32768 tokens using [PoSE](https://huggingface.co/papers/2309.10400).
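The training code is not included here, but the core idea of PoSE can be illustrated in a few lines: the model is fine-tuned on short windows whose position IDs are split into chunks and shifted by random skips, so the model sees relative positions spanning the full 32K target without ever processing a 32K-token input. The sketch below is a rough, hypothetical illustration of that sampling step (function name and chunking scheme are my own, not taken from the PoSE codebase):

```python
import random

def pose_position_ids(train_len=2048, target_len=32768, num_chunks=2, seed=0):
    """PoSE-style position IDs: split a short training window into chunks
    and shift each chunk by a random skip so the IDs cover the long
    target context. Illustrative sketch only."""
    rng = random.Random(seed)
    base = train_len // num_chunks
    bounds = [i * base for i in range(num_chunks)] + [train_len]
    total_skip = target_len - train_len  # headroom to distribute as skips
    # Non-decreasing cumulative skips, one per chunk.
    cuts = sorted(rng.randint(0, total_skip) for _ in range(num_chunks))
    position_ids = []
    for i in range(num_chunks):
        position_ids.extend(range(bounds[i] + cuts[i], bounds[i + 1] + cuts[i]))
    return position_ids

ids = pose_position_ids()
assert len(ids) == 2048                           # still a short window
assert max(ids) < 32768                           # stays within the 32K target
assert all(b > a for a, b in zip(ids, ids[1:]))   # strictly increasing
```

Because the skips are resampled every step, the model is eventually exposed to (nearly) all relative distances up to the target length at a fraction of the memory cost of full-length fine-tuning.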

The model was fine-tuned using [Rank-Stabilized LoRA](https://huggingface.co/blog/damjan-k/rslora) and the [LongAlpaca-12K](https://huggingface.co/datasets/Yukang/LongAlpaca-12k) dataset. I hope to continue extending the context in future versions and then apply the same methods to my [upscaled versions of OpenChat-3.5](https://huggingface.co/collections/Pretergeek/openchat-35-0106-with-additional-layers-66a8d3262c7c3ebdd7783a29).
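For readers unfamiliar with rsLoRA: it changes only the scaling applied to the low-rank update. Classic LoRA scales the update `BA` by `alpha / r`, which shrinks it as the rank grows; rsLoRA uses `alpha / sqrt(r)`, keeping the update's magnitude stable at higher ranks. A minimal sketch of just that difference (the helper function is hypothetical, for illustration):

```python
import math

def lora_scaling(alpha: float, r: int, rank_stabilized: bool) -> float:
    """Scaling factor applied to the LoRA update BA.
    Classic LoRA: alpha / r.  rsLoRA: alpha / sqrt(r)."""
    return alpha / math.sqrt(r) if rank_stabilized else alpha / r

# At rank 64 with alpha 16, classic LoRA scales the update by 0.25,
# while rsLoRA scales it by 2.0 -- an 8x difference at this rank.
print(lora_scaling(16, 64, False))  # 0.25
print(lora_scaling(16, 64, True))   # 2.0
```

In practice this is a one-flag change in common fine-tuning stacks (e.g. `use_rslora=True` in PEFT's `LoraConfig`), which is what makes it attractive for high-rank long-context fine-tuning like this.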

## Citations

```
@misc{zhu2024poseefficientcontextwindow,
      title={PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training},
      author={Dawei Zhu and Nan Yang and Liang Wang and Yifan Song and Wenhao Wu and Furu Wei and Sujian Li},
      year={2024},
      eprint={2309.10400},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2309.10400},
}
```