---
language: en
license: mit
library_name: pytorch
---

# Plainly Optimized Network

Dataset: BIGBENCH

Trainer Hyperparameters:
- `lr` = 5e-05
- `per_device_batch_size` = 8
- `gradient_accumulation_steps` = 2
- `weight_decay` = 0.0
- `seed` = 42

|eval_loss|eval_accuracy|epoch|
|--|--|--|
|10.379|0.571|1.0|
|9.388|0.643|2.0|
|10.286|0.571|3.0|
|10.324|0.571|4.0|
|10.254|0.571|5.0|
|10.166|0.571|6.0|
|10.122|0.571|7.0|
|10.020|0.571|8.0|
|10.035|0.571|9.0|
|9.961|0.571|10.0|
|9.963|0.571|11.0|
|9.962|0.571|12.0|
|9.990|0.500|13.0|
|10.817|0.571|14.0|
|10.030|0.571|15.0|
|10.049|0.571|16.0|
|10.057|0.571|17.0|
|10.067|0.571|18.0|
|10.080|0.571|19.0|
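
The hyperparameters and evaluation log above can be read programmatically. The following is a minimal sketch, not the training code used for this model: `TrainerConfig`, `EVAL_LOG`, and `best_row_by_loss` are hypothetical names introduced for illustration, and the device count is an assumption because the card does not state it. It shows how `per_device_batch_size` and `gradient_accumulation_steps` combine into an effective batch size, and which epoch in the table has the lowest `eval_loss`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainerConfig:
    """Hyperparameters copied from the card; the class itself is hypothetical."""
    lr: float = 5e-05
    per_device_batch_size: int = 8
    gradient_accumulation_steps: int = 2
    weight_decay: float = 0.0
    seed: int = 42

    def effective_batch_size(self, num_devices: int = 1) -> int:
        # num_devices is an assumption: the card does not say how many were used.
        return (self.per_device_batch_size
                * self.gradient_accumulation_steps
                * num_devices)


# (eval_loss, eval_accuracy, epoch) rows copied from the table in the card.
EVAL_LOG = [
    (10.379, 0.571, 1.0), (9.388, 0.643, 2.0), (10.286, 0.571, 3.0),
    (10.324, 0.571, 4.0), (10.254, 0.571, 5.0), (10.166, 0.571, 6.0),
    (10.122, 0.571, 7.0), (10.020, 0.571, 8.0), (10.035, 0.571, 9.0),
    (9.961, 0.571, 10.0), (9.963, 0.571, 11.0), (9.962, 0.571, 12.0),
    (9.990, 0.500, 13.0), (10.817, 0.571, 14.0), (10.030, 0.571, 15.0),
    (10.049, 0.571, 16.0), (10.057, 0.571, 17.0), (10.067, 0.571, 18.0),
    (10.080, 0.571, 19.0),
]


def best_row_by_loss(log):
    """Return the (eval_loss, eval_accuracy, epoch) row with the lowest eval_loss."""
    return min(log, key=lambda row: row[0])


if __name__ == "__main__":
    config = TrainerConfig()
    print(config.effective_batch_size())  # 16 on a single device
    print(best_row_by_loss(EVAL_LOG))     # (9.388, 0.643, 2.0)
```

On a single device the effective batch size is 8 × 2 = 16. The row with the lowest `eval_loss` is epoch 2.0, which is also the only epoch where `eval_accuracy` reaches 0.643.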