Context size

#6 opened by LoSboccacc

I'm having some challenges with the model becoming incoherent after ~12k tokens. Are there any specific considerations to apply for long-context handling?

I can't speak to any of the quants, but if you're running it natively, sglang is wonderful. I haven't had issues there. Lower min_p and lower the temperature.
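For reference, a minimal sketch of applying those sampling settings against sglang's native `/generate` endpoint. The host, port, prompt, and exact parameter values below are placeholders, not recommendations from this thread; tune them for your own deployment:

```python
import requests

# Query a locally running sglang server (started e.g. via
# `python -m sglang.launch_server --model-path <model> --port 30000`).
resp = requests.post(
    "http://localhost:30000/generate",
    json={
        "text": "Summarize the following document: ...",
        "sampling_params": {
            "temperature": 0.3,   # lower temperature for long-context stability
            "min_p": 0.02,        # lower min_p, per the suggestion above
            "max_new_tokens": 512,
        },
    },
)
print(resp.json()["text"])
```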
