This repository contains FP16 logits produced with the llama.cpp perplexity tool on wikitext-2-raw/wiki.test.raw. Using these logits as a reference, the KL divergence of a quantized model can be calculated without having to run the model at FP16 again.
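
As a rough sketch of what that comparison measures (not the actual llama.cpp file format or implementation): given the stored FP16 reference logits and the logits a quantized model produces over the same token positions, the per-token KL divergence is computed from the two softmax distributions. The array names and shapes below are purely illustrative.

```python
import numpy as np

def kl_divergence_per_token(base_logits: np.ndarray, quant_logits: np.ndarray) -> np.ndarray:
    """Per-token KL(P_base || Q_quant) from raw logits of shape (n_tokens, n_vocab)."""
    # Numerically stable log-softmax.
    def log_softmax(x):
        x = x - x.max(axis=-1, keepdims=True)
        return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))

    log_p = log_softmax(base_logits.astype(np.float32))   # FP16 reference distribution
    log_q = log_softmax(quant_logits.astype(np.float32))  # quantized model distribution
    p = np.exp(log_p)
    # KL(P || Q) = sum_i p_i * (log p_i - log q_i), evaluated per token position.
    return (p * (log_p - log_q)).sum(axis=-1)

# Hypothetical arrays standing in for the logits of the two perplexity runs
# over wiki.test.raw (the files here use llama.cpp's own binary layout).
base = np.random.randn(4, 32000)
quant = base + 0.05 * np.random.randn(4, 32000)
print(kl_divergence_per_token(base, quant).mean())
```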

Important: The logits I previously uploaded for LLaMA 3 Instruct 70b FP16 may have been affected by hardware instability issues, so any conclusions drawn from them may be incorrect. I have therefore deleted the file.
