GPU offload can result in gibberish

#1
by slaw0mir - opened

I really like this model. One thing I noticed is that too much GPU offload seems to break it, somewhat similarly to this post:
https://huggingface.co/TheBloke/guanaco-7B-GGML/discussions/2

In my case I get endlessly repeating words like "GroupGroup..". Offloading 32 layers seems to be the sweet spot for me; above 64 it's completely broken, and around 48 the output starts losing formatting strings and random words come out like this: "dr 1 nc 4ed"
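For reference, the layer counts above correspond to llama.cpp's `--n-gpu-layers` (`-ngl`) flag. A minimal sketch of capping offload at the 32-layer setting reported here; the model filename, context size, and prompt are placeholders, not from this thread:

```shell
# Cap GPU offload at 32 layers; remaining layers run on the CPU.
# Model path, context size, and prompt below are illustrative placeholders.
./llama-cli -m ./model.Q4_K_M.gguf \
  --n-gpu-layers 32 \
  -c 4096 \
  -p "Write a short greeting."
```

If output degrades, lowering `--n-gpu-layers` step by step (as the post describes) is a quick way to find the highest stable offload for a given setup.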

Make sure you see this document, as this is a class 3/4 model:

https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters

This doc outlines the exact settings to use to "tame" this type of model, including "baseline" settings and advanced ones.
By "tame" I mean fixing "errors" / "repeats" and other issues.
A lot of the time, just getting the basic parameters right (especially the penalties) is critical.

RE: Offloading;
There are math differences between GPU and CPU which can create "odd math issues"; these often show up as operational problems like the ones you describe.
