Checkpoint "step115000-tokens482B" identical to main model?

#6
by amodaresi - opened

Hi,
The hashes for checkpoint step115000-tokens482B and the main model show that these two models are identical. (The same goes for the other shards too.)
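For anyone who wants to reproduce the comparison locally, here is a minimal sketch, assuming the corresponding shard from each revision has already been downloaded. The helper names and the file paths in the comment are hypothetical and only for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" (end of file)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True if the two files have identical SHA-256 digests."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)


# Hypothetical usage; these paths are placeholders, not the actual shard names:
# same_contents("main/shard-1.bin", "step115000-tokens482B/shard-1.bin")
```

If this returns True for every pair of shards, the two revisions hold byte-identical weights.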

Is it really an early stop or a misupload?

Also, I noticed that the nitro model is identical to the "step651581-tokens2731B" checkpoint.
What exactly is the nitro revision? Is it the model before it's further tuned with learning rate annealing, as described in the paper?
