smol_llama-220M-GQA-fineweb_edu / train_results.json
End of training
83c5d1d verified
296 Bytes
{
    "epoch": 0.9999939379610938,
    "num_input_tokens_seen": 10810818560,
    "total_flos": 1.199767969182253e+19,
    "train_loss": 2.7848671950609445,
    "train_runtime": 243040.2706,
    "train_samples": 5278746,
    "train_samples_per_second": 21.72,
    "train_steps_per_second": 0.085
}
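
The raw fields above are easier to interpret once converted into a few derived quantities. The following is a minimal sketch, not part of the original repository: `derive_metrics` and its output key names are hypothetical, `train_runtime` is assumed to be in seconds, and `train_loss` is assumed to be a mean per-token cross-entropy in nats (so its exponential can be read as a training perplexity).

```python
import json
import math

# The values below are copied verbatim from train_results.json.
TRAIN_RESULTS = """
{
    "epoch": 0.9999939379610938,
    "num_input_tokens_seen": 10810818560,
    "total_flos": 1.199767969182253e+19,
    "train_loss": 2.7848671950609445,
    "train_runtime": 243040.2706,
    "train_samples": 5278746,
    "train_samples_per_second": 21.72,
    "train_steps_per_second": 0.085
}
"""


def derive_metrics(results: dict) -> dict:
    """Hypothetical helper: compute derived quantities from the raw fields.

    Assumes train_runtime is in seconds and train_loss is a mean
    per-token cross-entropy in nats.
    """
    runtime_s = results["train_runtime"]
    tokens = results["num_input_tokens_seen"]
    samples_seen = results["train_samples"] * results["epoch"]
    return {
        "runtime_hours": runtime_s / 3600,
        "tokens_per_second": tokens / runtime_s,
        "tokens_per_sample": tokens / samples_seen,
        "samples_per_second": results["train_samples"] / runtime_s,
        "train_perplexity": math.exp(results["train_loss"]),
        "flos_per_token": results["total_flos"] / tokens,
    }


results = json.loads(TRAIN_RESULTS)
derived = derive_metrics(results)
for name, value in derived.items():
    print(f"{name}: {value:,.2f}")
```

Recomputing `samples_per_second` from `train_samples` and `train_runtime` gives a quick consistency check against the reported `train_samples_per_second`, and `tokens_per_sample` indicates the effective sequence length per training sample.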