Congratulations!
Congratulations! Average score: 80.48
@LoneStriker Would definitely love to try an exl2 quant of this; even better if you can make an 8.0bpw one.
First model to reach 80%!
> @LoneStriker Would definitely love to try an exl2 quant of this; even better if you can make an 8.0bpw one.
Qwen 72B is not yet supported by exl2. I'll quantize the model if/when it is supported; I've been wanting to run it with exl2 myself since it came out...
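For reference, once exllamav2 adds Qwen support, the conversion is driven by the repo's convert.py script. Here's a minimal sketch wrapping that invocation in Python; the model paths are placeholders and the flags should be double-checked against the exllamav2 README:

```python
# Hypothetical sketch of an exllamav2 8.0bpw conversion (paths are placeholders).
import subprocess

subprocess.run(
    [
        "python", "convert.py",           # script from the exllamav2 repo
        "-i", "models/qwen-72b-hf",       # input HF model directory
        "-o", "work/",                    # working dir for the measurement pass
        "-cf", "models/qwen-72b-8.0bpw",  # compiled output directory
        "-b", "8.0",                      # target bits per weight
    ],
    check=True,
)
```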
Nice!
This model is derived from Qwen-72B, so take the scores with a grain of salt. Qwen is one of those base models whose pretraining likely included benchmark test data, so mentally give other models a handicap for a fair comparison.
Regardless, thanks for sharing this new model @ArkaAbacus and team @abacusai! :)
If you have the spare compute to take requests/challenges, I'm very curious to see if your training method can improve upon https://huggingface.co/allenai/tulu-2-dpo-70b, a Llama-2-70b-based model, for a more direct comparison of efficacy in pushing the envelope.
@Ont
Qwen-72B is doing really well on EQ-Bench, which is definitely not the result of training on test data.
https://eqbench.com/
Just ran the fresh correlations to Arena Elo, and EQ-Bench looks really promising (a quick sketch for reproducing these numbers follows the list).
Spearman correlations:
- EQ-Bench v2: 0.863
- MT-Bench: 0.891
- Alpaca v2: 0.899

Kendall's tau:
- EQ-Bench v2: 0.730
- MT-Bench: 0.759
- Alpaca v2: 0.759
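For anyone who wants to reproduce this kind of check, here's a minimal sketch using scipy.stats. The score arrays below are made-up placeholders, not the real leaderboard data:

```python
# Minimal sketch: rank correlations between Arena Elo and a benchmark.
# The values below are illustrative placeholders, not real scores.
from scipy.stats import spearmanr, kendalltau

arena_elo = [1250, 1180, 1120, 1090, 1050]  # hypothetical Elo ratings
eq_bench  = [82.1, 78.4, 71.0, 69.5, 60.2]  # hypothetical EQ-Bench v2 scores

rho, _ = spearmanr(arena_elo, eq_bench)
tau, _ = kendalltau(arena_elo, eq_bench)
print(f"Spearman: {rho:.3f}, Kendall's tau: {tau:.3f}")
```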
Now, does this mean that the base model does well on everything? Definitely not, but it shows that it's not simply a number-gymnastics model. Anyone who has tried Qwen probably knows this already, though.
(Also, notice all the Dolphin models up there on that leaderboard. I don't know how much @ehartford contributed to this model, but Qwen plus the marine biologist guy looks like a good combination to me.)
> @LoneStriker Would definitely love to try an exl2 quant of this; even better if you can make an 8.0bpw one.

> Qwen 72B is not yet supported by exl2. I'll quantize the model if/when it is supported; I've been wanting to run it with exl2 myself since it came out...
I think this is a llamafied version. It just uses a different tokenizer, so it cannot be converted to GGUF, but possibly to exl2?
Unfortunately, the exl2 quant fails. I was able to get the model to convert with the llama.cpp GGUF conversion, but the resulting GGUF file was not loadable for me, so I've taken my GGUF quants offline for now until I can figure out why it's not loading.
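If it helps with debugging, one way to narrow down a GGUF loading failure is to inspect the file's metadata directly. Here's a minimal sketch using the gguf Python package that ships with llama.cpp (pip install gguf); the file path is a placeholder:

```python
# Hypothetical debugging sketch: dump GGUF metadata to spot a missing or
# unexpected key (e.g. general.architecture), a common cause of load failures.
from gguf import GGUFReader

reader = GGUFReader("qwen-72b-q4_k_m.gguf")  # placeholder path

for name in reader.fields:   # metadata keys stored in the file
    print(name)

print(f"{len(reader.tensors)} tensors")  # sanity-check the tensor count
```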
> (Also, notice all the Dolphin models up there on that leaderboard. I don't know how much @ehartford contributed to this model, but Qwen plus the marine biologist guy looks like a good combination to me.)
This work is unrelated; it was led by @ArkaAbacus.