Decent model but damn...
It's still under testing but:
Pros:
- Very decent output quality for a MS-24B fine tune
- Decent prompt adherence
Cons:
- Very bad at instruction following in ChatML. Complete inability to properly title and summarize a simple text, even given examples.
- Subpar instruction following in Tekken7. Fails the test more than 30% of the time. That's a very bad result for a Mistral-based model.
- Failed basic menu navigation by adding a bunch of text everywhere despite instructions.
It's better, overall, than Sisyphus who was completely unusable. But you gotta boost its IFEval a bit, it's really hurting the model.
We've been having a hell of a time with Mistral Small 3 in general. If it were as simple as tweaking our data then we'd be in a better spot, but Roselily is one instance of us just throwing stuff at the wall and seeing what sticks.
Also
> Failed basic menu navigation by adding a bunch of text everywhere despite instructions.
I think you're trying to use it for something it's not meant for.
Also, it turned out Sisyphus was just using completely the wrong dataset, something we only found out afterwards, so it's no surprise that it's not great at instruction following.
> Sisyphus who was completely unusable
:(
/lh
Yeah, I've noticed that MS-24B fine-tunes were all kinda subpar. There's something funky in that base model. "Thinking" models aside, this is the first one I've found with a half-decent, non-looping writing style. It would really suck if the only way to get it to write and respond without the Mistral-isms was to murder the one thing its base model is really good at. :(
> I think you're trying to use it for something it's not meant for.
Not really. The same methodology is applied to all the models I run, 'creative' or not. Most RP models I've seen are perfectly able to pick one option out of 4 in a very simply formatted prompt instead of telling me their life story, I assure you. It's nothing I can't fix with a bit of grammar, but I rarely have to rely on that fallback method. And that's the first time it has ever happened to me on a Mistral base.
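For anyone wondering what the grammar fallback could look like, here's a rough sketch. It assumes "grammar" means a GBNF-style sampling grammar (the format llama.cpp accepts) and that the menu options are the strings "1" to "4". The helper names `choice_grammar` and `is_clean_choice` are made up for illustration and aren't part of my actual test setup.

```python
def choice_grammar(options):
    """Build a GBNF-style grammar whose root rule accepts exactly one option."""
    if not options:
        raise ValueError("need at least one option")

    def quote(text):
        # Escape backslashes and double quotes so each option is a valid literal.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    return "root ::= " + " | ".join(quote(opt) for opt in options)


def is_clean_choice(reply, options):
    """Unconstrained check: the reply must be one option and nothing else."""
    return reply.strip() in options


options = ["1", "2", "3", "4"]  # hypothetical menu entries
grammar = choice_grammar(options)
print(grammar)
```

The grammar string would then be handed to whatever backend supports constrained sampling. `is_clean_choice` is the check that should pass without any grammar at all, which is the part that failed here.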