text_shortening_model_v42

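This model (ldos/text_shortening_model_v42) is a fine-tuned version of facebook/bart-large-xsum. The training dataset is not specified (the auto-generated card records it as None). The final checkpoint (epoch 20) achieves the following results on the evaluation set: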

Loss: 3.2972
Rouge1: 0.4588
Rouge2: 0.2356
Rougel: 0.4162
Rougelsum: 0.4165
Bert precision: 0.8664
Bert recall: 0.8655
Average word count: 8.5616
Max word count: 16
Min word count: 4
Average token count: 16.1051
% shortened texts with length > 12: 4.8048
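
The card does not say how the length metrics above were computed. As a minimal sketch, assuming "word count" means whitespace-separated words and "length > 12" refers to that same word count, they could be reproduced as follows. The function name `length_stats` and the sample texts are hypothetical and not taken from the original training code.

```python
def length_stats(texts, limit=12):
    """Summarize output lengths for a list of shortened texts.

    Assumption: a "word" is a whitespace-separated token. The original
    evaluation script may have counted words differently.
    """
    counts = [len(text.split()) for text in texts]
    n = len(counts)
    return {
        "average_word_count": sum(counts) / n,
        "max_word_count": max(counts),
        "min_word_count": min(counts),
        # Share of outputs (in percent) whose word count exceeds `limit`.
        "pct_longer_than_limit": 100.0 * sum(c > limit for c in counts) / n,
    }


if __name__ == "__main__":
    # Hypothetical model outputs, for illustration only.
    samples = ["Save big on shoes today", "New phone released"]
    print(length_stats(samples))
```

The average token count would additionally need the model's tokenizer, so it is left out of this sketch.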

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 20
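
To make the schedule concrete, here is a small sketch of how the learning rate would evolve under these settings. It assumes no warmup steps (the card lists none) and takes 73 optimizer steps per epoch from the Step column of the training results. The helper `linear_lr` is illustrative, not the library's scheduler implementation.

```python
# Values copied from the hyperparameter list above.
HYPERPARAMS = {
    "learning_rate": 3e-4,
    "train_batch_size": 16,
    "eval_batch_size": 16,
    "seed": 42,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_epochs": 20,
}

# Taken from the Step column of the training results (73 steps after epoch 1).
STEPS_PER_EPOCH = 73
TOTAL_STEPS = STEPS_PER_EPOCH * HYPERPARAMS["num_epochs"]


def linear_lr(step, base_lr, total_steps, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    Assumption: warmup_steps defaults to 0 because the card reports none.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 5, 10, 15, 20):
        step = epoch * STEPS_PER_EPOCH
        lr = linear_lr(step, HYPERPARAMS["learning_rate"], TOTAL_STEPS)
        print(f"epoch {epoch:2d}  step {step:4d}  lr {lr:.6f}")
```

With a batch size of 16 and 73 steps per epoch, the training set would hold between 1,153 and 1,168 examples, assuming a single device and no gradient accumulation; the card does not confirm either.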

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Bert precision	Bert recall	Average word count	Max word count	Min word count	Average token count	% shortened texts with length > 12
1.1087	1.0	73	2.0307	0.4468	0.2283	0.3951	0.394	0.8582	0.8635	8.5435	15	4	14.6997	3.6036
0.6451	2.0	146	2.0108	0.4629	0.2419	0.4159	0.4142	0.8724	0.8668	8.1081	17	5	14.7718	4.2042
0.4594	3.0	219	1.9499	0.4267	0.229	0.3887	0.3882	0.8579	0.8575	8.3093	16	5	13.976	1.8018
0.4681	4.0	292	2.0819	0.4127	0.2049	0.3734	0.372	0.8549	0.8543	8.3123	17	4	15.3514	3.6036
0.334	5.0	365	2.1413	0.4302	0.2184	0.3885	0.3886	0.857	0.8595	8.8589	15	4	14.5285	3.6036
0.296	6.0	438	2.0881	0.4716	0.2349	0.4216	0.4217	0.8684	0.8706	8.7928	16	5	15.0841	6.006
0.2588	7.0	511	2.2671	0.4517	0.2262	0.4085	0.4079	0.8654	0.8632	8.4985	14	4	14.8258	3.3033
0.1883	8.0	584	2.4313	0.4572	0.2369	0.409	0.4099	0.8646	0.867	8.7207	16	5	14.2192	4.2042
0.1822	9.0	657	2.3293	0.4413	0.2154	0.3943	0.3936	0.857	0.8619	8.8318	16	4	16.2973	6.006
0.1298	10.0	730	2.4037	0.4614	0.2303	0.4145	0.4144	0.8668	0.866	8.4715	18	4	15.8348	6.3063
0.1413	11.0	803	2.7031	0.4533	0.2337	0.4099	0.4095	0.8656	0.8637	8.2943	16	4	15.9009	4.2042
0.0786	12.0	876	2.5766	0.441	0.2218	0.3982	0.3982	0.8609	0.8613	8.5916	16	4	15.8228	3.6036
0.0662	13.0	949	2.8013	0.4408	0.2177	0.3989	0.3984	0.8573	0.8596	8.5946	15	4	16.4204	4.2042
0.0635	14.0	1022	2.8125	0.44	0.2265	0.3974	0.3975	0.8591	0.8618	8.8919	17	4	16.7898	4.5045
0.0648	15.0	1095	2.7665	0.4642	0.2371	0.42	0.4197	0.8662	0.8675	8.7477	16	4	15.6186	4.8048
0.0446	16.0	1168	3.1244	0.4599	0.2327	0.4211	0.4205	0.8656	0.8667	8.6396	16	4	16.1351	5.7057
0.0475	17.0	1241	3.3107	0.4626	0.24	0.422	0.4221	0.8673	0.8696	8.7027	16	5	16.3934	5.4054
0.0332	18.0	1314	3.1808	0.465	0.2413	0.4231	0.4231	0.8672	0.867	8.5315	16	5	16.048	5.1051
0.0252	19.0	1387	3.2446	0.4587	0.2315	0.4142	0.4143	0.866	0.8655	8.5586	16	4	16.012	4.8048
0.0294	20.0	1460	3.2972	0.4588	0.2356	0.4162	0.4165	0.8664	0.8655	8.5616	16	4	16.1051	4.8048
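
Across the table, training loss falls from 1.1087 to 0.0294 while validation loss rises from 2.0307 to 3.2972, a pattern consistent with overfitting. The headline metrics reported at the top of this card match the epoch 20 row. The sketch below copies the epoch, training loss, validation loss, and Rouge1 columns from the table and picks the best epoch under two criteria. It is only an illustration of how the log could be inspected, not part of the original training procedure.

```python
# (epoch, training_loss, validation_loss, rouge1), copied from the table above.
RESULTS = [
    (1, 1.1087, 2.0307, 0.4468), (2, 0.6451, 2.0108, 0.4629),
    (3, 0.4594, 1.9499, 0.4267), (4, 0.4681, 2.0819, 0.4127),
    (5, 0.3340, 2.1413, 0.4302), (6, 0.2960, 2.0881, 0.4716),
    (7, 0.2588, 2.2671, 0.4517), (8, 0.1883, 2.4313, 0.4572),
    (9, 0.1822, 2.3293, 0.4413), (10, 0.1298, 2.4037, 0.4614),
    (11, 0.1413, 2.7031, 0.4533), (12, 0.0786, 2.5766, 0.4410),
    (13, 0.0662, 2.8013, 0.4408), (14, 0.0635, 2.8125, 0.4400),
    (15, 0.0648, 2.7665, 0.4642), (16, 0.0446, 3.1244, 0.4599),
    (17, 0.0475, 3.3107, 0.4626), (18, 0.0332, 3.1808, 0.4650),
    (19, 0.0252, 3.2446, 0.4587), (20, 0.0294, 3.2972, 0.4588),
]


def best_epoch_by_val_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def best_epoch_by_rouge1(rows):
    """Return the row with the highest Rouge1 score."""
    return max(rows, key=lambda row: row[3])


if __name__ == "__main__":
    print("lowest validation loss:", best_epoch_by_val_loss(RESULTS))
    print("highest Rouge1:", best_epoch_by_rouge1(RESULTS))
```

By validation loss the best row is epoch 3 (1.9499), and by Rouge1 it is epoch 6 (0.4716), so an earlier checkpoint may be worth evaluating if one was saved.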

Framework versions

Transformers 4.33.1
PyTorch 2.0.1+cu118
Datasets 2.14.5
Tokenizers 0.13.3
