astrollama-3-8b-base_aic / all_results.json
research4pan
First model version
b64b3ce
233 Bytes
{
    "epoch": 1.0,
    "total_flos": 208699171799040.0,
    "train_loss": 1.9003729102663567,
    "train_runtime": 27485.2321,
    "train_samples": 255099,
    "train_samples_per_second": 9.281,
    "train_steps_per_second": 0.048
}
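
The throughput fields above can be cross-checked against the raw counts. The following is a minimal sketch, not part of the original file. It assumes that `train_runtime` is in seconds and that `train_samples_per_second` is computed as samples times epochs divided by runtime. The helper name `samples_per_second` and the derived variables are hypothetical, introduced only for illustration.

```python
import json

# Metrics copied verbatim from all_results.json above.
results = json.loads("""
{
    "epoch": 1.0,
    "total_flos": 208699171799040.0,
    "train_loss": 1.9003729102663567,
    "train_runtime": 27485.2321,
    "train_samples": 255099,
    "train_samples_per_second": 9.281,
    "train_steps_per_second": 0.048
}
""")


def samples_per_second(r):
    """Recompute throughput, ASSUMING samples * epochs / runtime in seconds."""
    return r["train_samples"] * r["epoch"] / r["train_runtime"]


# Recomputed throughput, rounded to the same precision as the reported field.
recomputed = round(samples_per_second(results), 3)

# Wall-clock training time in hours, assuming train_runtime is in seconds.
runtime_hours = results["train_runtime"] / 3600

# Rough optimizer step count implied by the reported steps-per-second rate.
# This is only approximate because the rate is rounded to three decimals.
approx_steps = results["train_steps_per_second"] * results["train_runtime"]

print(f"recomputed samples/s: {recomputed}")
print(f"runtime (hours):      {runtime_hours:.2f}")
print(f"approx. steps:        {approx_steps:.0f}")
```

Under these assumptions the recomputed throughput agrees with the reported `train_samples_per_second`, which suggests the fields are internally consistent. The step count is only an estimate, since the batch size is not recorded in this file.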