[Feedback]

#2 · opened by Darkknight535

Feedback here...

Darkknight535 pinned discussion

Hi. I think I'll use this model as my main one now. On my RTX 3060 it fits in the Q5_K_S format (I used an imatrix file from mradermacher and requantized your Q8_0) instead of the old Q5_K_M/Q6_K. It's a little slower than the previous model, but by the look of it, it's good enough to be worth the slight slowdown. Don't forget to correct the minor spelling mistakes in the README (the Instruct Prompt, capital letters after commas). My settings are temperature 1.2, Min P 0.05, Rep P 1.1 (I consider them the golden mean); a sketch of this setup follows at the end of this post.
I think it wouldn't be bad to expand the context to 10-12-14-16 thousand tokens in the future (anything above 16 thousand already starts to lose its meaning in any model). That would let the model keep longer dialogues in mind. And thank you for listening to my advice.
I hope models of this size keep developing in the future.
If you need any help, you can write to me.
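
For reference, a minimal sketch of the requantization and sampler settings mentioned above, assuming llama.cpp's `llama-quantize` tool and the `llama-cpp-python` bindings; all file names are hypothetical:

```python
import subprocess
from llama_cpp import Llama

# Requantize the released Q8_0 GGUF down to Q5_K_S, guided by an
# importance matrix (assumed here to be mradermacher's imatrix file).
subprocess.run([
    "llama-quantize",
    "--imatrix", "imatrix.dat",   # hypothetical imatrix path
    "model-Q8_0.gguf",            # hypothetical input file
    "model-Q5_K_S.gguf",          # hypothetical output file
    "Q5_K_S",
], check=True)

# Load the requantized model and generate with the settings quoted above:
# temperature 1.2, Min P 0.05, repetition penalty 1.1.
llm = Llama(model_path="model-Q5_K_S.gguf")
out = llm(
    "Hello!",
    temperature=1.2,
    min_p=0.05,
    repeat_penalty=1.1,
    max_tokens=128,
)
print(out["choices"][0]["text"])
```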

P.S. I used to use character.ai, but their censorship and the constant "Can I ask you a question?"... they really pissed me off. This (and your previous model) is much better than character.ai.

And yeah, this model likes to use constructions like "Test," he did say, "test2." I don't know if that's a problem for anyone else, but I don't really like seeing it in every message (the problem is less pronounced at 12B).

Okay, thanks. About the context length: increasing it results in a loss of coherency, as it did for the majority of models, Llama 3.1 included. Llama 3 is better than Llama 3.1.
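
To illustrate what the suggestion would involve: a sketch of loading the model with an extended context window via linear RoPE scaling through `llama-cpp-python`. Both numbers are assumptions for illustration (a hypothetical 8k training context stretched to 16k), and coherence can degrade exactly as described above:

```python
from llama_cpp import Llama

# Stretch a model trained on ~8k context to a 16k window by halving
# the RoPE frequency scale (linear scaling). Values are illustrative;
# quality often degrades the further you push past the trained context.
llm = Llama(
    model_path="model-Q5_K_S.gguf",  # hypothetical file
    n_ctx=16384,                     # requested context window
    rope_freq_scale=0.5,             # trained 8k / requested 16k = 0.5
)
```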

Eh, I didn't understand the test thing.

I mean, the model usually ends a line of dialogue with a comma.

  • try turning ON [trim incomplete sentences]
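
For what it's worth, a minimal Python sketch of what such a trim option plausibly does (a guess at the frontend's behavior, not its actual implementation): cut the reply back to the last sentence-ending punctuation mark.

```python
import re

def trim_incomplete_sentence(text: str) -> str:
    """Cut text back to the last ., !, or ? (optionally followed by a
    closing quote). A guess at what a "trim incomplete sentences"
    option does, not the frontend's actual code."""
    matches = list(re.finditer(r'[.!?]["\']?', text))
    return text[:matches[-1].end()] if matches else text

# The trailing unfinished quoted fragment is dropped:
print(trim_incomplete_sentence('It works. "Test," he said, "and then'))
# -> 'It works.'
```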

I don't think it will help, because the comma is used as a connector between two lines of dialogue with an action in between (there are just too many of those constructions in 15B). Maybe it's part of the training dataset?

I guess...
