text_shortening_model_v40

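This model is a fine-tuned version of facebook/bart-large-xsum. The training dataset is not specified (the auto-generated card records it as "None"). It achieves the following results on the evaluation set, which match the final (epoch 20) row of the training results: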

Loss: 3.3335
Rouge1: 0.4511
Rouge2: 0.2377
Rougel: 0.4039
Rougelsum: 0.4038
Bert precision: 0.8635
Bert recall: 0.8629
Average word count: 8.5826
Max word count: 16
Min word count: 5
Average token count: 16.5616
% shortened texts with length > 12: 4.8048
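
The length metrics above can be reproduced with a few lines of Python. This is a minimal sketch, not the evaluation code used for this model. It assumes that words are split on whitespace and that the "length > 12" threshold counts words rather than tokens; the card states neither. The function name `length_stats` and the sample texts are hypothetical.

```python
def length_stats(texts, threshold=12):
    """Word-count statistics over a list of shortened texts.

    Assumption: words are split on whitespace. The model card does not
    say how words were counted.
    """
    if not texts:
        raise ValueError("texts must be non-empty")
    counts = [len(t.split()) for t in texts]
    over = sum(1 for c in counts if c > threshold)
    return {
        "average_word_count": sum(counts) / len(counts),
        "max_word_count": max(counts),
        "min_word_count": min(counts),
        "pct_longer_than_threshold": 100.0 * over / len(counts),
    }


# Hypothetical model outputs, not taken from the evaluation set.
samples = [
    "Council approves new budget for road repairs",
    "Heavy rain floods city centre",
    "Local team wins the regional cup final after extra time and a tense penalty shootout",
]
stats = length_stats(samples)
print(stats)
```

Reporting the maximum and the share of outputs above a threshold alongside the average shows how often the model exceeds a target length, which the average alone hides.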

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 20
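
To illustrate what a linear learning-rate schedule means for these settings, the sketch below computes the learning rate at a given optimizer step in plain Python. It assumes zero warmup steps, since none are listed, and takes 73 steps per epoch from the training results table. The function name `linear_lr` is hypothetical; this is not the training code itself.

```python
LEARNING_RATE = 3e-4   # learning_rate from the list above
NUM_EPOCHS = 20        # num_epochs from the list above
STEPS_PER_EPOCH = 73   # step count at epoch 1.0 in the training results table
WARMUP_STEPS = 0       # assumption: no warmup is listed in the card
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 1460, the last step in the table


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 730, 1460):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.0003, is halved by the end of epoch 10 (step 730), and reaches zero at step 1460.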

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Bert precision	Bert recall	Average word count	Max word count	Min word count	Average token count	% shortened texts with length > 12
3.0922	1.0	73	2.2144	0.4539	0.2272	0.4068	0.4055	0.8657	0.8684	8.7027	15	5	14.3423	4.2042
1.75	2.0	146	2.0055	0.4658	0.2381	0.4085	0.4088	0.8654	0.8656	8.7087	16	5	15.1652	4.8048
1.311	3.0	219	2.0021	0.456	0.2257	0.4124	0.4117	0.8644	0.8646	8.6396	15	5	15.9279	5.1051
1.0163	4.0	292	2.0698	0.467	0.2403	0.4159	0.4162	0.8636	0.8699	9.2973	16	5	17.2162	9.9099
0.8546	5.0	365	2.0707	0.4527	0.2392	0.4129	0.4126	0.8637	0.8647	8.4895	17	4	16.3153	4.8048
0.7222	6.0	438	2.1452	0.4562	0.2349	0.4077	0.4064	0.8693	0.8623	8.021	15	4	14.1051	1.2012
0.5723	7.0	511	2.3520	0.4563	0.2403	0.4142	0.413	0.8666	0.8658	8.5916	16	5	16.5465	6.9069
0.5274	8.0	584	2.2896	0.4502	0.2434	0.4077	0.4078	0.8639	0.8639	8.5586	14	5	14.8048	2.1021
0.3767	9.0	657	2.2928	0.4565	0.2368	0.4125	0.4114	0.8682	0.8623	8.0691	14	4	14.4204	1.8018
0.2987	10.0	730	2.5411	0.4539	0.2383	0.4057	0.4056	0.8652	0.8631	8.5826	15	5	15.6637	4.5045
0.2319	11.0	803	2.8995	0.4513	0.2367	0.4069	0.4068	0.8631	0.8622	8.6607	17	5	16.4535	5.7057
0.2167	12.0	876	2.7950	0.4632	0.2521	0.4163	0.4162	0.8673	0.8679	8.7267	16	4	16.3243	6.3063
0.1952	13.0	949	2.6240	0.4537	0.2396	0.406	0.4059	0.8632	0.8648	8.8258	18	5	16.2613	7.8078
0.1395	14.0	1022	2.8894	0.4588	0.2412	0.4141	0.4144	0.864	0.8658	8.6216	15	5	16.6426	3.6036
0.1298	15.0	1095	2.7580	0.4562	0.2384	0.4085	0.4088	0.8661	0.8659	8.5586	15	5	16.3634	5.4054
0.1044	16.0	1168	2.7724	0.466	0.2527	0.4175	0.4171	0.8677	0.8694	8.7387	15	4	16.4535	5.1051
0.0944	17.0	1241	2.9161	0.4429	0.232	0.3986	0.3986	0.8619	0.8621	8.6306	16	5	16.5255	5.4054
0.077	18.0	1314	3.1718	0.4549	0.2372	0.4054	0.4052	0.863	0.8639	8.6456	15	5	16.7447	4.8048
0.0561	19.0	1387	3.2650	0.4581	0.2413	0.4092	0.4089	0.866	0.865	8.5195	16	5	16.4174	4.8048
0.0542	20.0	1460	3.3335	0.4511	0.2377	0.4039	0.4038	0.8635	0.8629	8.5826	16	5	16.5616	4.8048

Framework versions

Transformers 4.33.1
Pytorch 2.0.1+cu118
Datasets 2.14.5
Tokenizers 0.13.3
