Arconte-13B
Arconte is a Llama-2 merge. It has gone through many iterations, trying different recipes, models, and merge methods; iterations I and Z in particular showed promise. This version of Arconte is iteration I redone with a more experimental approach to the merge recipe, and it shows great results.
Originally, the idea was to do one of those fancy three-step merges: DARE TIES model A, DARE TIES model B, then SLERP model A + model B. I already did the SLERP model C, but it's flawed because iterations I and Z were flawed. I still plan to do model C, so now I am remaking iteration Z.
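For readers unfamiliar with the SLERP step mentioned above, here is a minimal sketch of spherical linear interpolation between two weight vectors. It is not the code used to build Arconte; the function name `slerp`, the `eps` threshold, and the toy vectors are assumptions for illustration only. A real merge would apply something like this per tensor across both models.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors.

    t = 0 returns v0, t = 1 returns v1. Falls back to plain linear
    interpolation when the vectors are (nearly) parallel or one is zero.
    """
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))

    def lerp():
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    if n0 < eps or n1 < eps:
        return lerp()

    # Angle between the two vectors, clamped for numerical safety.
    cos_theta = max(-1.0, min(1.0, dot / (n0 * n1)))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        return lerp()

    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# Toy example: blend two orthogonal "weight" vectors half and half.
merged = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

Unlike a straight average, the half-way blend of two unit vectors here still has unit length, which is the usual argument for SLERP over linear interpolation when merging weights.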
Models used:
After completing model C, the current roadmap is to either move on to Mistral merges or try my hand at making LoRAs/QLoRAs. No Mixtral, nor anything above 13B parameters, in the future due to hardware limitations.
All testing was done with a Q5_K_M GGUF. I'll upload the full GGUF range along with an imatrix version soon.
Update 3/30/24
I have tested this model further and concluded that I find it boring. I remember I greenlighted this model because it was coherent (as much as a Q5_K_M can be), but now I think it's just not that good. But perhaps it is just my taste in models? Or maybe my sampling settings are bad? I would like some feedback to know how good or bad this model is. I still plan to cook that C model, but I don't know if I will use this one to do it.
I will be releasing another model soon, an older model that I think is better than this one.