text_shortening_model_v43

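This model is a fine-tuned version of facebook/bart-large-xsum. The fine-tuning dataset is not specified (the card records it as None). It achieves the following results on the evaluation set, which match the final epoch (epoch 20) of the training results: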

Loss: 2.8362
Rouge1: 0.4977
Rouge2: 0.2645
Rougel: 0.4429
Rougelsum: 0.4422
Bert precision: 0.8744
Bert recall: 0.8788
Average word count: 8.5344
Max word count: 18
Min word count: 4
Average token count: 15.9365
% shortened texts with length > 12: 8.4656
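
The card does not say how the length statistics above were computed. As a minimal sketch, assuming words are counted by whitespace splitting and that the "length > 12" threshold is measured in words, they could be reproduced along these lines. The function name `length_stats` and the sample strings are hypothetical, not taken from the original evaluation code.

```python
def length_stats(texts, limit=12):
    """Summarise output lengths, counting words by whitespace splitting.

    Assumption: the card's word counts use a comparable definition of "word".
    """
    counts = [len(t.split()) for t in texts]
    if not counts:
        raise ValueError("texts must not be empty")
    over_limit = sum(1 for c in counts if c > limit)
    return {
        "average_word_count": sum(counts) / len(counts),
        "max_word_count": max(counts),
        "min_word_count": min(counts),
        # Share of outputs longer than `limit` words, as a percentage.
        "pct_longer_than_limit": 100.0 * over_limit / len(counts),
    }


# Hypothetical model outputs, for illustration only.
sample = [
    "Reset your password here",
    "Sign up for the weekly newsletter to get product updates and release notes first",
]
stats = length_stats(sample)
print(stats)
```

The token-count metric is not covered here because it depends on the model's tokenizer.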

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 20
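
To make the schedule concrete, the sketch below works out the learning rate implied by these settings. It assumes no warmup (the card lists none), a single device, and no gradient accumulation. The 83 steps per epoch come from the Step column of the training results. The helper names are illustrative and do not come from the training script.

```python
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 16
NUM_EPOCHS = 20
STEPS_PER_EPOCH = 83  # Step column reads 83 after epoch 1
WARMUP_STEPS = 0      # assumption: no warmup is listed on the card

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 1660, the final value in the Step column


def linear_lr(step):
    """Learning rate at an optimizer step under linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


def train_set_size_bounds(steps_per_epoch, batch_size):
    """Range of dataset sizes n for which ceil(n / batch_size) == steps_per_epoch."""
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


print(linear_lr(0), linear_lr(TOTAL_STEPS // 2), linear_lr(TOTAL_STEPS))
print(train_set_size_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE))
```

Under those assumptions the learning rate halves by step 830 and reaches zero at step 1660, and the training set would hold between 1313 and 1328 examples.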

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Bert precision	Bert recall	Average word count	Max word count	Min word count	Average token count	% shortened texts with length > 12
0.5902	1.0	83	1.5909	0.4855	0.2475	0.4202	0.4201	0.8682	0.8736	8.5661	15	4	16.0635	3.9683
0.383	2.0	166	1.4957	0.516	0.2977	0.4569	0.4567	0.8751	0.881	8.8016	17	4	16.3519	8.4656
0.3301	3.0	249	1.6999	0.5073	0.2678	0.4401	0.4402	0.8662	0.8856	10.4233	22	5	17.9286	24.6032
0.3264	4.0	332	1.5703	0.5121	0.2818	0.4525	0.4527	0.8716	0.8844	9.1561	19	4	15.8704	12.4339
0.3901	5.0	415	1.6559	0.4875	0.2629	0.4362	0.4365	0.8661	0.8772	9.1111	16	5	15.2275	5.0265
0.2982	6.0	498	1.8927	0.499	0.267	0.4479	0.4476	0.8724	0.8824	9.0185	17	5	16.6376	10.0529
0.2864	7.0	581	1.8092	0.4961	0.2673	0.4377	0.4372	0.8705	0.8789	8.6614	17	5	14.4656	5.291
0.2059	8.0	664	2.0127	0.4921	0.2652	0.4408	0.4408	0.8729	0.8778	8.5899	16	4	15.2725	6.8783
0.1655	9.0	747	2.1199	0.4886	0.2697	0.4392	0.4391	0.8713	0.8777	8.7011	16	4	16.0132	7.4074
0.2361	10.0	830	2.0002	0.4814	0.2536	0.427	0.4257	0.8666	0.8769	8.9921	19	4	15.037	6.0847
0.2329	11.0	913	2.3033	0.4961	0.2725	0.4441	0.4426	0.8722	0.8775	8.6958	17	5	16.2619	10.582
0.1743	12.0	996	2.4562	0.499	0.275	0.4474	0.4477	0.8745	0.878	8.4127	17	4	15.873	9.2593
0.1716	13.0	1079	2.4160	0.4811	0.2528	0.4299	0.4297	0.8708	0.8751	8.4735	16	4	16.0873	6.0847
0.1394	14.0	1162	2.3996	0.4783	0.2445	0.4214	0.4205	0.8686	0.8735	8.6587	19	5	15.6376	8.9947
0.0769	15.0	1245	2.8364	0.4902	0.258	0.4369	0.4362	0.8697	0.8767	8.7222	18	4	16.4286	9.5238
0.1039	16.0	1328	2.5845	0.5009	0.267	0.4473	0.4464	0.8757	0.88	8.5291	18	4	16.0688	8.7302
0.098	17.0	1411	2.7602	0.491	0.2628	0.4379	0.4377	0.8711	0.8779	8.6587	18	4	16.2249	9.7884
0.0879	18.0	1494	2.6813	0.4987	0.2679	0.4468	0.4471	0.8761	0.8793	8.3862	18	4	15.4735	7.9365
0.0945	19.0	1577	2.8612	0.5034	0.2703	0.4489	0.449	0.8762	0.8806	8.5582	19	4	16.0873	8.4656
0.0702	20.0	1660	2.8362	0.4977	0.2645	0.4429	0.4422	0.8744	0.8788	8.5344	18	4	15.9365	8.4656
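
The table shows training loss falling from 0.5902 to 0.0702 while validation loss rises from 1.5909 to 2.8362, a pattern that usually points to overfitting. The reported metrics come from epoch 20, not from the epoch with the lowest validation loss. The sketch below copies four columns from the table and picks out the best epoch by validation loss and by Rouge1. The variable names are illustrative.

```python
# (epoch, training_loss, validation_loss, rouge1), copied from the table above.
rows = [
    (1, 0.5902, 1.5909, 0.4855),
    (2, 0.3830, 1.4957, 0.5160),
    (3, 0.3301, 1.6999, 0.5073),
    (4, 0.3264, 1.5703, 0.5121),
    (5, 0.3901, 1.6559, 0.4875),
    (6, 0.2982, 1.8927, 0.4990),
    (7, 0.2864, 1.8092, 0.4961),
    (8, 0.2059, 2.0127, 0.4921),
    (9, 0.1655, 2.1199, 0.4886),
    (10, 0.2361, 2.0002, 0.4814),
    (11, 0.2329, 2.3033, 0.4961),
    (12, 0.1743, 2.4562, 0.4990),
    (13, 0.1716, 2.4160, 0.4811),
    (14, 0.1394, 2.3996, 0.4783),
    (15, 0.0769, 2.8364, 0.4902),
    (16, 0.1039, 2.5845, 0.5009),
    (17, 0.0980, 2.7602, 0.4910),
    (18, 0.0879, 2.6813, 0.4987),
    (19, 0.0945, 2.8612, 0.5034),
    (20, 0.0702, 2.8362, 0.4977),
]

# Epoch with the lowest validation loss and epoch with the highest Rouge1.
best_by_val_loss = min(rows, key=lambda r: r[2])
best_by_rouge1 = max(rows, key=lambda r: r[3])

print("lowest validation loss: epoch", best_by_val_loss[0], best_by_val_loss[2])
print("highest Rouge1: epoch", best_by_rouge1[0], best_by_rouge1[3])
```

Both criteria select epoch 2 (validation loss 1.4957, Rouge1 0.516), so an earlier checkpoint may be worth comparing against the final one if it was saved.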

Framework versions

Transformers 4.33.1
Pytorch 2.0.1+cu118
Datasets 2.14.5
Tokenizers 0.13.3
