text_shortening_model_v41

This model is a fine-tuned version of facebook/bart-large-xsum on an unspecified dataset. It achieves the following results on the evaluation set (a minimal inference sketch follows the list):

  • Loss: 3.7205
  • Rouge1: 0.4471
  • Rouge2: 0.2088
  • RougeL: 0.3939
  • RougeLsum: 0.3941
  • BERT precision: 0.8647
  • BERT recall: 0.8624
  • Average word count: 8.6517
  • Max word count: 18
  • Min word count: 4
  • Average token count: 16.5045
  • Shortened texts longer than 12 words: 5.7057%
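The card does not include usage code; below is a minimal inference sketch, assuming the checkpoint is published on the Hub as ldos/text_shortening_model_v41. The generation limits are illustrative assumptions that loosely mirror the reported word-count range, not the settings used for evaluation.

```python
# Minimal inference sketch (assumption, not from the original card).
from transformers import pipeline

# A BART model fine-tuned for shortening is exposed through the summarization task.
shortener = pipeline("summarization", model="ldos/text_shortening_model_v41")

text = (
    "The committee announced on Tuesday that the annual budget review "
    "meeting will be postponed until further notice."
)
# max/min lengths are token limits chosen to roughly match the reported
# 4-18 word output range; adjust for your inputs.
result = shortener(text, max_length=20, min_length=4, do_sample=False)
print(result[0]["summary_text"])
```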

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto Seq2SeqTrainingArguments follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 25
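The original training script is not included in the card. As a sketch, assuming the standard Transformers Trainer workflow, the hyperparameters above could map onto Seq2SeqTrainingArguments (Transformers 4.33) as follows; the output directory is an assumption, and dataset loading is omitted because the training data is unspecified.

```python
# Sketch of the training configuration (assumption: standard Trainer workflow).
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    Seq2SeqTrainingArguments,
)

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-xsum")
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-xsum")

args = Seq2SeqTrainingArguments(
    output_dir="text_shortening_model_v41",  # assumed output path
    learning_rate=3e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,                          # Adam betas from the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=25,
    evaluation_strategy="epoch",             # the results table logs one eval per epoch
    predict_with_generate=True,             # assumption: needed to compute ROUGE at eval time
)
```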

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | BERT precision | BERT recall | Avg word count | Max word count | Min word count | Avg token count | % > 12 words |
|---------------|-------|------|-----------------|--------|--------|--------|-----------|----------------|-------------|----------------|----------------|----------------|-----------------|--------------|
| 2.424         | 1.0   | 73   | 2.2763          | 0.4466 | 0.2286 | 0.3969 | 0.3973    | 0.8628         | 0.8607      | 8.4805         | 17             | 5              | 14.6967         | 3.6036       |
| 1.331         | 2.0   | 146  | 2.1237          | 0.4671 | 0.2385 | 0.4119 | 0.4124    | 0.86           | 0.8702      | 9.7117         | 20             | 4              | 16.7838         | 14.4144      |
| 0.9725        | 3.0   | 219  | 1.9947          | 0.448  | 0.2384 | 0.4004 | 0.4025    | 0.8603         | 0.8627      | 8.8649         | 16             | 5              | 15.8709         | 5.7057       |
| 0.7753        | 4.0   | 292  | 2.2302          | 0.4435 | 0.2201 | 0.3983 | 0.3991    | 0.8653         | 0.8588      | 8.1141         | 16             | 5              | 15.5526         | 1.8018       |
| 0.6017        | 5.0   | 365  | 2.1392          | 0.4293 | 0.2142 | 0.383  | 0.3836    | 0.8593         | 0.8604      | 8.6156         | 17             | 4              | 14.1982         | 3.3033       |
| 0.4911        | 6.0   | 438  | 2.4747          | 0.4166 | 0.1882 | 0.365  | 0.3668    | 0.8582         | 0.8556      | 8.4234         | 14             | 5              | 14.4024         | 3.6036       |
| 0.6947        | 7.0   | 511  | 2.6372          | 0.3894 | 0.1904 | 0.3527 | 0.3534    | 0.8471         | 0.8477      | 8.5165         | 14             | 4              | 16.6607         | 4.2042       |
| 0.5839        | 8.0   | 584  | 2.6038          | 0.3641 | 0.1627 | 0.3272 | 0.3276    | 0.8464         | 0.8402      | 7.7508         | 13             | 4              | 15.2342         | 0.6006       |
| 0.4668        | 9.0   | 657  | 2.7711          | 0.4015 | 0.1904 | 0.3627 | 0.3626    | 0.8537         | 0.8517      | 8.8889         | 17             | 4              | 16.2402         | 3.9039       |
| 0.4539        | 10.0  | 730  | 2.8819          | 0.4    | 0.1903 | 0.353  | 0.3538    | 0.8526         | 0.8519      | 8.6156         | 15             | 5              | 16.1652         | 3.9039       |
| 0.4018        | 11.0  | 803  | 2.8273          | 0.3799 | 0.1764 | 0.3404 | 0.3407    | 0.8432         | 0.8454      | 8.7177         | 17             | 4              | 17.0661         | 3.6036       |
| 0.2764        | 12.0  | 876  | 2.9767          | 0.3888 | 0.1825 | 0.3504 | 0.3509    | 0.8526         | 0.8475      | 8.4354         | 13             | 5              | 16.015          | 2.1021       |
| 0.2338        | 13.0  | 949  | 2.8883          | 0.4184 | 0.202  | 0.3714 | 0.3714    | 0.852          | 0.8585      | 9.3754         | 17             | 5              | 15.8709         | 8.4084       |
| 0.1878        | 14.0  | 1022 | 3.1069          | 0.4302 | 0.1966 | 0.3782 | 0.3791    | 0.8616         | 0.8573      | 8.4324         | 15             | 4              | 16.2492         | 3.3033       |
| 0.1608        | 15.0  | 1095 | 2.8510          | 0.4461 | 0.2151 | 0.392  | 0.3925    | 0.8627         | 0.8625      | 8.7598         | 19             | 4              | 16.1471         | 5.7057       |
| 0.1416        | 16.0  | 1168 | 3.0792          | 0.4246 | 0.1983 | 0.3735 | 0.3735    | 0.8591         | 0.8568      | 8.6637         | 16             | 5              | 16.3303         | 7.5075       |
| 0.1507        | 17.0  | 1241 | 3.2058          | 0.4336 | 0.2016 | 0.379  | 0.3796    | 0.8593         | 0.8589      | 8.9129         | 17             | 5              | 16.6697         | 5.1051       |
| 0.108         | 18.0  | 1314 | 3.0551          | 0.4485 | 0.2248 | 0.4002 | 0.4006    | 0.8645         | 0.8608      | 8.2492         | 14             | 5              | 15.967          | 3.6036       |
| 0.0756        | 19.0  | 1387 | 3.1943          | 0.4439 | 0.2167 | 0.3919 | 0.3925    | 0.8652         | 0.8608      | 8.4865         | 15             | 5              | 15.8919         | 3.9039       |
| 0.104         | 20.0  | 1460 | 3.1156          | 0.4411 | 0.2035 | 0.3894 | 0.3903    | 0.8644         | 0.8612      | 8.5135         | 16             | 5              | 16.4294         | 6.006        |
| 0.0716        | 21.0  | 1533 | 3.4040          | 0.4389 | 0.201  | 0.3824 | 0.3838    | 0.8632         | 0.8614      | 8.7508         | 16             | 4              | 16.5075         | 6.006        |
| 0.0576        | 22.0  | 1606 | 3.4264          | 0.4476 | 0.2104 | 0.3902 | 0.391     | 0.8657         | 0.8629      | 8.5405         | 16             | 4              | 16.4144         | 6.6066       |
| 0.041         | 23.0  | 1679 | 3.5711          | 0.447  | 0.2108 | 0.3931 | 0.393     | 0.8639         | 0.8619      | 8.5976         | 18             | 4              | 16.4264         | 7.2072       |
| 0.0355        | 24.0  | 1752 | 3.6294          | 0.4509 | 0.215  | 0.3981 | 0.3989    | 0.8652         | 0.8632      | 8.6186         | 18             | 4              | 16.4985         | 6.006        |
| 0.0313        | 25.0  | 1825 | 3.7205          | 0.4471 | 0.2088 | 0.3939 | 0.3941    | 0.8647         | 0.8624      | 8.6517         | 18             | 4              | 16.5045         | 5.7057       |
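Note that validation loss bottoms out at epoch 3 (1.9947) and climbs steadily afterwards while training loss keeps falling, so the final epoch-25 checkpoint reported above likely overfits the training data.

The evaluation code is not part of the card. As a sketch, under the assumption that the metrics were computed with the Hugging Face evaluate library, the columns above could be reproduced roughly as follows; the helper name and tokenizer argument are hypothetical.

```python
# Sketch of the metric computation (assumption: the `evaluate` library was used).
import evaluate

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

def shortening_metrics(predictions, references, tokenizer):
    """Hypothetical helper mirroring the columns of the results table."""
    metrics = rouge.compute(predictions=predictions, references=references)
    bs = bertscore.compute(predictions=predictions, references=references, lang="en")
    word_counts = [len(p.split()) for p in predictions]
    token_counts = [len(tokenizer(p).input_ids) for p in predictions]
    metrics.update({
        "bert_precision": sum(bs["precision"]) / len(bs["precision"]),
        "bert_recall": sum(bs["recall"]) / len(bs["recall"]),
        "average_word_count": sum(word_counts) / len(word_counts),
        "max_word_count": max(word_counts),
        "min_word_count": min(word_counts),
        "average_token_count": sum(token_counts) / len(token_counts),
        # share of outputs that exceed a 12-word shortening budget
        "pct_longer_than_12_words": 100 * sum(w > 12 for w in word_counts) / len(word_counts),
    })
    return metrics
```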

Framework versions

  • Transformers 4.33.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.5
  • Tokenizers 0.13.3