Update README.md

---
license: other
language:
- en
---

This is an exllama V2 quantization of https://huggingface.co/Gryphe/MythoMax-L2-13b
This particular version is designed for maximum quality at the cost of size.

I noticed that the previous 8bpw version used a small bitrate for some layers and reported a lower quantized perplexity (ppl) than its base ppl, implying that the layer optimizer was overfitting to the calibration dataset.
In response, I edited measurement.json to add +1 error to all bitrates except for 8.13 (the max).
(Don't reuse that file for other quants!!)
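The +1-error edit can be sketched as a small script. This is a minimal sketch assuming a simplified measurement.json layout, where each layer maps to a list of candidate modes with `bpw` and `err` keys; the real exllamav2 schema may differ:

```python
import json

MAX_BPW = 8.13  # the highest candidate bitrate; left unpenalized

def penalize_errors(measurement: dict) -> dict:
    """Add +1.0 to the stored error of every candidate quantization
    mode below the maximum bitrate, so the layer optimizer always
    selects the 8.13 bpw mode. The key names used here are
    assumptions, not the exact exllamav2 schema."""
    for options in measurement["measurement"].values():
        for opt in options:
            if opt["bpw"] < MAX_BPW:
                opt["err"] += 1.0
    return measurement

# Usage (hypothetical file layout):
# with open("measurement.json") as f:
#     data = json.load(f)
# with open("measurement.json", "w") as f:
#     json.dump(penalize_errors(data), f)
```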

That means this version uses the best 8bit-32g quantization mode for all layers. In out-of-sample tests, this squeezes out slightly better perplexity than the 8bit version.
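For reference, the perplexity being compared here is the exponential of the mean per-token negative log-likelihood over held-out text. A generic sketch of that definition (not the exact evaluation script used for these quants):

```python
import math

def perplexity(nlls: list[float]) -> float:
    # Perplexity = exp(mean per-token negative log-likelihood).
    # Lower is better; computing it on held-out (out-of-sample) text
    # avoids the overfitting issue described above.
    return math.exp(sum(nlls) / len(nlls))
```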

Calibration data: https://huggingface.co/datasets/wikitext/resolve/refs%2Fconvert%2Fparquet/wikitext-2-v1/test/0000.parquet

An improved, potentially even perfected variant of MythoMix, my [MythoLogic-L2](https://huggingface.co/Gryphe/MythoLogic-L2-13b) and [Huginn](https://huggingface.co/The-Face-Of-Goonery/Huginn-13b-FP16) merge using a highly experimental tensor-type merge technique. The main difference from MythoMix is that I allowed more of Huginn to intermingle with the single tensors located at the front and end of the model, resulting in increased coherency across the entire structure.

The script and the accompanying templates I used to produce both can [be found here](https://github.com/Gryphe/BlockMerge_Gradient/tree/main/YAML).