Interesting Model

#1
by GlobalMeltdown - opened

Hey, first off, thank you for sharing.

I was curious: is this supposed to be a reasoning distill, or just a model trained on new synthetic data from those models?

I've only just downloaded it, and lately it's hard to tell apart what people are putting out. I'm really curious about it, though, because I do see a lot of Llama 3 used as a base, it's right at the 8B mark with 33 layers, and for someone like me with 8 GB of VRAM that makes it more appealing.

(Unrelated to this repo, obviously.) One odd thing that's probably not worth noting: if you use KoboldCPP's suggested layer count, ignore it, because for some reason it estimated far fewer layers than it should for an 8B. I think at 8192 context with the Q5_K_M it showed 12, which was weird. I usually set it myself, but just a heads up for those who go for the GGUFs. I loaded the full 33 and could probably go up a couple of quants.

Edit: Also, how are people using this successfully in SillyTavern without responses being 90% thoughts and 10% dialogue/actions/narration? Most of the time it seems like it's 100% thoughts. Any advice would be great.
