SUMMARY MODEL:
Model Params Size: 222882048
Model Params Size Formatted: 222.88 M
Model Disk Size: 891648255
Model Disk Size Formatted: 891.65 MB

TRAINING AND VALIDATION RESULTS:
Training batch size: 64
Validation batch size: 128
Total expected epochs: 40
Total expected training steps: 94040
Total expected training steps 2: 94040
Total trained epochs: 40.0
Total trained steps: 94040
Elapsed time: 54841.370337963104 seconds
Elapsed time (formatted): 15:14:01
Total flos: 3.6650496018087936e+18
Total flos (formatted): 3.665050e+18
Best epoch val_loss: 0.20909027755260468
Best model checkpoint: /root/pretrain_utg4java_02/checkpoint-91689

SUMMARY DATASETS:
Loaded Dataset:
DatasetDict({
    train: Dataset({
        features: ['text'],
        num_rows: 150523
    })
    valid: Dataset({
        features: ['text'],
        num_rows: 18816
    })
    test: Dataset({
        features: ['text'],
        num_rows: 18815
    })
})
Tokenized Dataset:
DatasetDict({
    train: Dataset({
        features: ['input_ids'],
        num_rows: 150523
    })
    valid: Dataset({
        features: ['input_ids'],
        num_rows: 18816
    })
    test: Dataset({
        features: ['input_ids'],
        num_rows: 18815
    })
})
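
The reported figures can be cross-checked against each other with a little arithmetic. The sketch below is not part of the original run. It copies the numbers from the summary and assumes one optimizer step per batch on a single device, no gradient accumulation, and a final partial batch that is not counted. The summary does not state any of this, but the reported step count is consistent with it. The variable names and the `format_hms` helper are hypothetical.

```python
# Values copied from the summary above.
train_rows = 150523
train_batch_size = 64
epochs = 40
reported_total_steps = 94040
best_checkpoint_step = 91689
param_count = 222882048
disk_bytes = 891648255
elapsed_seconds = 54841.370337963104

# Assumption: one step per full batch, last partial batch not counted.
steps_per_epoch = train_rows // train_batch_size
total_steps = steps_per_epoch * epochs

# Assumption: checkpoints are saved on epoch boundaries, so the step number
# of the best checkpoint should divide evenly by steps_per_epoch.
best_epoch, leftover_steps = divmod(best_checkpoint_step, steps_per_epoch)

# Average bytes stored per parameter; a value near 4 suggests 32-bit weights.
bytes_per_param = disk_bytes / param_count


def format_hms(seconds: float) -> str:
    """Format a duration in seconds as HH:MM:SS, truncating fractions."""
    whole = int(seconds)
    hours, rest = divmod(whole, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch)
    print("total steps:", total_steps, "reported:", reported_total_steps)
    print("best checkpoint epoch:", best_epoch, "leftover steps:", leftover_steps)
    print("bytes per parameter:", round(bytes_per_param, 4))
    print("elapsed:", format_hms(elapsed_seconds))
```

Under these assumptions the computed step count matches the reported 94040, and checkpoint-91689 falls exactly at the end of epoch 39 of 40. The disk size works out to roughly 4 bytes per parameter, which would be consistent with weights stored in 32-bit floating point plus a small amount of file overhead.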