Why don't the available checkpoints start from a lower number of steps/amount of seen data?

#6
by user09180912480 - opened

The 7B model has checkpoints from steps 150-1000, with only 1B tokens seen. Why is the same not available for the 13B model?
