jogonba2
/

barthez-deft-archeologie


José Ángel González


	---
	license: apache-2.0
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: barthez-deft-archeologie
	  results:
	  - task:
	      name: Summarization
	      type: summarization
	    metrics:
	    - name: Rouge1
	      type: rouge
	      value: 37.1845
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# barthez-deft-archeologie

	This model is a fine-tuned version of [moussaKam/barthez](https://huggingface.co/moussaKam/barthez) on an unknown dataset.

	Note: this model comes from preliminary experiments and underperforms the models published in the paper, which use [MBartHez](https://huggingface.co/moussaKam/mbarthez) with HAL/Wiki pre-training and copy mechanisms.

	It achieves the following results on the evaluation set:
	- Loss: 2.0733
	- Rouge1: 37.1845
	- Rouge2: 16.9534
	- Rougel: 28.8416
	- Rougelsum: 29.077
	- Gen Len: 34.4028
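
A minimal loading sketch, assuming the checkpoint is published under the repository id shown on this page (`jogonba2/barthez-deft-archeologie`) and that `transformers`, `torch`, and `sentencepiece` are installed. The generation settings (`max_length`, `num_beams`) and the 512-token input truncation are illustrative assumptions, not the settings behind the scores above.

```python
MODEL_ID = "jogonba2/barthez-deft-archeologie"  # repository id as shown on the model page


def summarize(text, max_length=64, num_beams=4):
    """Generate a summary of `text` with the fine-tuned checkpoint.

    max_length and num_beams are illustrative defaults (assumptions),
    not settings documented in this model card.
    """
    # Imported lazily so that defining the function does not require
    # the heavy dependencies to be installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long inputs; the 512-token limit is an assumption.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
        num_beams=num_beams,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(document)` on a French source text downloads the weights on first use and returns the generated summary as a string.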

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 3e-05
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 20.0
	- mixed_precision_training: Native AMP
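
To make the `linear` scheduler entry concrete, here is a small sketch of the learning rate at a given step. It assumes zero warmup steps (the card lists none) and takes the total of 2160 steps from the training results table (20 epochs of 108 steps); the function name `linear_lr` is hypothetical.

```python
BASE_LR = 3e-05     # learning_rate from the hyperparameter list
TOTAL_STEPS = 2160  # 20 epochs x 108 steps, per the training results table
WARMUP_STEPS = 0    # assumption: the card does not mention warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the rate at the start, at two intermediate epochs, and at the end.
for epoch in (0, 5, 10, 20):
    step = epoch * 108
    print(f"epoch {epoch:2d} (step {step:4d}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at 3e-05, is halved by the end of epoch 10, and reaches zero at the final step.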

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
	| 3.4832        | 1.0   | 108  | 2.4237          | 22.6662 | 10.009  | 19.8729 | 19.8814   | 15.8333 |
	| 2.557         | 2.0   | 216  | 2.2328          | 24.8102 | 11.9911 | 20.4773 | 20.696    | 19.0139 |
	| 2.2702        | 3.0   | 324  | 2.2002          | 25.6482 | 11.6191 | 21.8383 | 21.9341   | 18.1944 |
	| 2.1119        | 4.0   | 432  | 2.1266          | 25.5806 | 11.9765 | 21.3973 | 21.3503   | 19.4306 |
	| 1.9582        | 5.0   | 540  | 2.1072          | 25.6578 | 12.2709 | 22.182  | 22.0548   | 19.1528 |
	| 1.8137        | 6.0   | 648  | 2.1008          | 26.5272 | 11.4033 | 22.359  | 22.3259   | 19.4722 |
	| 1.7725        | 7.0   | 756  | 2.1074          | 25.0405 | 11.1773 | 21.1369 | 21.1847   | 19.1806 |
	| 1.6772        | 8.0   | 864  | 2.0959          | 26.5237 | 11.6028 | 22.5018 | 22.3931   | 19.3333 |
	| 1.5798        | 9.0   | 972  | 2.0976          | 27.7443 | 11.9898 | 22.4052 | 22.2954   | 19.7222 |
	| 1.4753        | 10.0  | 1080 | 2.0733          | 28.3502 | 12.9162 | 22.6352 | 22.6015   | 19.8194 |
	| 1.4646        | 11.0  | 1188 | 2.1091          | 27.9198 | 12.8591 | 23.0718 | 23.0779   | 19.6111 |
	| 1.4082        | 12.0  | 1296 | 2.1036          | 28.8509 | 13.0987 | 23.4189 | 23.5044   | 19.4861 |
	| 1.2862        | 13.0  | 1404 | 2.1222          | 28.6641 | 12.8157 | 22.6799 | 22.7051   | 19.8611 |
	| 1.2612        | 14.0  | 1512 | 2.1487          | 26.9709 | 11.6084 | 22.0312 | 22.0543   | 19.875  |
	| 1.2327        | 15.0  | 1620 | 2.1808          | 28.218  | 12.6239 | 22.7372 | 22.7881   | 19.7361 |
	| 1.2264        | 16.0  | 1728 | 2.1778          | 26.7393 | 11.4474 | 21.6057 | 21.555    | 19.7639 |
	| 1.1848        | 17.0  | 1836 | 2.1995          | 27.6902 | 12.1082 | 22.0406 | 22.0101   | 19.6806 |
	| 1.133         | 18.0  | 1944 | 2.2038          | 27.0402 | 12.1846 | 21.7793 | 21.7513   | 19.8056 |
	| 1.168         | 19.0  | 2052 | 2.2116          | 27.5149 | 11.9876 | 22.1113 | 22.1527   | 19.7222 |
	| 1.1206        | 20.0  | 2160 | 2.2133          | 28.2321 | 12.677  | 22.749  | 22.8485   | 19.5972 |
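
The table above can be scanned programmatically to find the strongest epochs. The sketch below copies the epoch, validation loss, and Rouge1 columns and picks the minimum-loss and maximum-Rouge1 rows. Validation loss bottoms out at epoch 10 with 2.0733, which matches the loss reported at the top of the card; that the published checkpoint is the epoch-10 one is an inference from that match, not something the card states.

```python
# (epoch, validation_loss, rouge1) copied from the training results table
RESULTS = [
    (1, 2.4237, 22.6662), (2, 2.2328, 24.8102), (3, 2.2002, 25.6482),
    (4, 2.1266, 25.5806), (5, 2.1072, 25.6578), (6, 2.1008, 26.5272),
    (7, 2.1074, 25.0405), (8, 2.0959, 26.5237), (9, 2.0976, 27.7443),
    (10, 2.0733, 28.3502), (11, 2.1091, 27.9198), (12, 2.1036, 28.8509),
    (13, 2.1222, 28.6641), (14, 2.1487, 26.9709), (15, 2.1808, 28.218),
    (16, 2.1778, 26.7393), (17, 2.1995, 27.6902), (18, 2.2038, 27.0402),
    (19, 2.2116, 27.5149), (20, 2.2133, 28.2321),
]

# Lowest validation loss and highest Rouge1 do not have to coincide.
best_loss = min(RESULTS, key=lambda row: row[1])
best_rouge1 = max(RESULTS, key=lambda row: row[2])

print(f"lowest validation loss: epoch {best_loss[0]} ({best_loss[1]})")
print(f"highest Rouge1:         epoch {best_rouge1[0]} ({best_rouge1[2]})")
```

Training loss keeps falling through epoch 20 while validation loss rises after epoch 10, which is the usual sign of overfitting on a small training set.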


	### Framework versions

	- Transformers 4.10.2
	- PyTorch 1.7.1+cu110
	- Datasets 1.11.0
	- Tokenizers 0.10.3