llm-jp-3-13b-instruct2-grpo-R1-0223_step800 / model-00001-of-00006.safetensors

Commit History

Trained with Unsloth
faa04c7
verified

morizon committed on