
fp16?

#1
by son-of-man

It seems like there might be a more significant difference between quants (even Q8) and fp16 on Llama 3 than there was on Llama 2 or Mistral.
Since it's such a small model, running an fp16 GGUF isn't too hard on consumer hardware, so the extra memory use seems like a worthwhile tradeoff for the increased nuance and coherence it might offer.
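If anyone wants to try it, here's a minimal sketch of loading an fp16 GGUF with llama-cpp-python; the file name and sampling settings are placeholders, so point them at whatever you've converted locally.

```python
from llama_cpp import Llama

# Minimal sketch: load an fp16 GGUF and run a short completion.
# "llama3-8b-f16.gguf" is a placeholder; use your own file path.
llm = Llama(
    model_path="llama3-8b-f16.gguf",
    n_ctx=8192,        # Llama 3 supports an 8k context window
    n_gpu_layers=-1,   # offload every layer to the GPU if VRAM allows
)

out = llm(
    "Once upon a time,",
    max_tokens=128,
    temperature=0.8,
)
print(out["choices"][0]["text"])
```

If you need to produce the fp16 file yourself, llama.cpp's conversion script (convert_hf_to_gguf.py in recent versions, though the name has changed over time) can emit one with --outtype f16.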
