Context size
#6
by
LoSboccacc
- opened
I'm having some challenges with the model becoming incoherent after ~12k tokens. Are there any specific considerations to apply for long-context handling?
I can't speak to any of the quants, but if you're running it natively, sglang is wonderful; I haven't had issues there. Lower min_p and lower the temperature.
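A minimal sketch of those settings, assuming sglang's OpenAI-compatible `/v1/chat/completions` endpoint; the endpoint URL, model name, and exact values below are illustrative starting points, not tuned recommendations:

```python
import json

# Illustrative sampling settings per the suggestion above: lower temperature
# and a small min_p. These values are assumptions, not sglang defaults.
payload = {
    "model": "default",
    "messages": [{"role": "user", "content": "Summarize the document above."}],
    "temperature": 0.3,  # down from a common 0.7-1.0 default
    "min_p": 0.01,       # small min_p cutoff
    "max_tokens": 512,
}

# Send against a locally running sglang server, e.g.:
#   curl http://localhost:30000/v1/chat/completions \
#     -H "Content-Type: application/json" -d "$(python this_script.py)"
print(json.dumps(payload))
```

If the server rejects `min_p` as a top-level field, passing it via the client's extra/sampling-params mechanism is the usual fallback.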