Commit 5d7eee1 · Parent(s): b3605d0

Update README.md

README.md CHANGED
@@ -4,13 +4,14 @@ datasets:
 - EleutherAI/wikitext_document_level
 language:
 - en
+pipeline_tag: text-generation
 ---
 
 [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
 
 LLaMA 33b finetuned on `wikitext_document_level` with a combination of both linear and NTK-aware RoPE scaling.
 
-Trained with alpha=4, scale=2.
+Trained with alpha=4, scale=2. Definitely works for sequence lengths up to and including 4096. Might work for much longer, but I don't have the VRAM to test properly. ¯\\\_(ツ)\_/¯
 
 <img src="llama33b-s2a4-qlora/resolve/main/perplexity.png" alt="Perplexity Graph" />
 
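The "linear and NTK-aware RoPE scaling" line in the hunk above compresses a fair amount of machinery. Here is a minimal sketch of how the two combine, using the commit's alpha=4 and scale=2; the head dimension of 128 and rotary base of 10000 are LLaMA's usual values, assumed here since the diff never states them:

```python
import torch

def scaled_rope_angles(head_dim: int = 128, base: float = 10000.0,
                       alpha: float = 4.0, scale: float = 2.0,
                       max_pos: int = 4096) -> torch.Tensor:
    # NTK-aware part: stretch the rotary base by alpha^(d/(d-2)), which
    # leaves the high-frequency dimensions nearly intact and interpolates
    # the low-frequency ones.
    ntk_base = base * alpha ** (head_dim / (head_dim - 2))
    inv_freq = 1.0 / ntk_base ** (torch.arange(0, head_dim, 2).float() / head_dim)
    # Linear part: divide position indices by `scale`, so position 4096 is
    # embedded roughly where the base model saw position 2048.
    positions = torch.arange(max_pos).float() / scale
    return torch.outer(positions, inv_freq)  # angles fed to sin/cos

angles = scaled_rope_angles()  # shape (4096, 64)
```

Linear scaling alone compresses positions by `scale`; NTK-aware scaling alone stretches the rotary base by `alpha`. Combining them is what lets a model pretrained at 2048 tokens reach the 4096-token lengths the card claims.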
@@ -30,4 +31,4 @@ The following `bitsandbytes` quantization config was used during training:
 ### Framework versions
 
 
-- PEFT 0.4.0.dev0
+- PEFT 0.4.0.dev0
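The second hunk only shows the header of the `bitsandbytes` section; the config values themselves fall outside the diff context and stay unknown. For orientation only, a representative QLoRA-style config looks like the following (every value here is an assumption, not read from the repo):

```python
import torch
from transformers import BitsAndBytesConfig

# Representative QLoRA defaults -- the card's actual values are elided by the diff.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # load the base model in 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # run the matmuls in bf16
)
```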
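Since the footer pins PEFT 0.4.0.dev0 and the new `pipeline_tag` advertises text generation, a hedged usage sketch for the adapter might look like this. The base checkpoint id is an assumption (the card never names it), and the adapter id is inferred from the perplexity image path:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "huggyllama/llama-30b"    # assumed LLaMA 33b base; not named in the card
adapter_id = "llama33b-s2a4-qlora"  # inferred from the image path above

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the QLoRA adapter
```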