kennethge123/sst5-t5-base-kd

Plainly Optimized Network

Dataset: BIGBENCH

Trainer Hyperparameters:

lr = 5e-05
per_device_batch_size = 1
gradient_accumulation_steps = 4
weight_decay = 1e-09
seed = 42
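
As a minimal sketch of how these settings combine, the snippet below stores them under the argument names used by Hugging Face `TrainingArguments` and computes the effective batch size per optimizer step. The mapping of `lr` to `learning_rate` and of `per_device_batch_size` to `per_device_train_batch_size` is an assumption, as is the single-device default; the card does not say which trainer class or how many devices were used. `effective_batch_size` is a hypothetical helper, not part of any library.

```python
# Hyperparameters as listed in the card. The key names follow Hugging Face
# `TrainingArguments`; that mapping is an assumption, not stated in the card.
training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 4,
    "weight_decay": 1e-09,
    "seed": 42,
}


def effective_batch_size(kwargs, num_devices=1):
    """Examples contributing to one optimizer step.

    `num_devices` is an assumption: the card does not report the device count.
    """
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


print(effective_batch_size(training_kwargs))
```

With one device, gradients from 4 single-example batches are accumulated before each weight update. If the `transformers` library is installed, the same dictionary could be passed as `TrainingArguments(output_dir=..., **training_kwargs)`, where the output directory is left to the reader.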

eval_loss	eval_accuracy	epoch
58.940	0.054	1.0
54.182	0.049	2.0
56.362	0.051	3.0
52.705	0.046	4.0
55.357	0.050	5.0
53.973	0.048	6.0
56.034	0.050	7.0
51.731	0.045	8.0
54.661	0.048	9.0
50.378	0.043	10.0
51.579	0.044	11.0
51.193	0.044	12.0
52.724	0.046	13.0
52.055	0.045	14.0
51.406	0.044	15.0
51.539	0.045	16.0
52.422	0.046	17.0
50.304	0.043	18.0
50.937	0.044	19.0
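
The table can be scanned programmatically to pick a checkpoint. The sketch below copies the rows above into a string, parses them, and reports the epochs with the lowest `eval_loss` and the highest `eval_accuracy`. The helper name `parse_log` is hypothetical, and whether a checkpoint was actually saved at each epoch is an assumption the card does not confirm.

```python
# Evaluation rows copied from the table above (whitespace-separated).
EVAL_LOG = """\
eval_loss eval_accuracy epoch
58.940 0.054 1.0
54.182 0.049 2.0
56.362 0.051 3.0
52.705 0.046 4.0
55.357 0.050 5.0
53.973 0.048 6.0
56.034 0.050 7.0
51.731 0.045 8.0
54.661 0.048 9.0
50.378 0.043 10.0
51.579 0.044 11.0
51.193 0.044 12.0
52.724 0.046 13.0
52.055 0.045 14.0
51.406 0.044 15.0
51.539 0.045 16.0
52.422 0.046 17.0
50.304 0.043 18.0
50.937 0.044 19.0
"""


def parse_log(text):
    """Turn the header line plus numeric rows into a list of dicts."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, map(float, line.split()))) for line in lines[1:]]


rows = parse_log(EVAL_LOG)

# Lowest loss and highest accuracy need not fall on the same epoch.
best_loss = min(rows, key=lambda r: r["eval_loss"])
best_acc = max(rows, key=lambda r: r["eval_accuracy"])

print("lowest eval_loss:", best_loss)
print("highest eval_accuracy:", best_acc)
```

On these rows the lowest loss (50.304) falls at epoch 18, while the highest accuracy (0.054) falls at epoch 1, so the two selection criteria point to different epochs.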
