bert-25 / README.md

hung200504

	---
	license: cc-by-4.0
	base_model: deepset/bert-base-cased-squad2
	tags:
	- generated_from_trainer
	model-index:
	- name: bert-25
	results: []
	---

	# bert-25

	This model is a fine-tuned version of [deepset/bert-base-cased-squad2](https://huggingface.co/deepset/bert-base-cased-squad2) on an unknown dataset.
	It achieves the following results on the evaluation set at the final checkpoint (step 280, epoch 10.0):
	- Loss: 10.9150

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 32
	- eval_batch_size: 32
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 10
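
To make the `linear` scheduler entry concrete, the sketch below computes the learning rate at a few steps. It assumes zero warmup steps (the card lists no warmup setting) and takes the total of 280 steps from the training results table; `linear_lr` is a hypothetical helper written for this illustration, not a function from the Transformers library.

```python
BASE_LR = 2e-05      # learning_rate from the list above
TOTAL_STEPS = 280    # last step in the training results table
WARMUP_STEPS = 0     # assumption: the card does not list a warmup setting


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 70, 140, 210, 280):
        print(f"step {step:3d}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at the configured 2e-05, is halved by the midpoint of training, and reaches zero at the last step.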

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:----:|:---------------:|
	| 11.2284 | 0.18 | 5 | 12.3262 |
	| 10.9876 | 0.36 | 10 | 12.2748 |
	| 11.1442 | 0.54 | 15 | 12.2245 |
	| 10.9113 | 0.71 | 20 | 12.1755 |
	| 10.8104 | 0.89 | 25 | 12.1267 |
	| 10.6362 | 1.07 | 30 | 12.0793 |
	| 10.8187 | 1.25 | 35 | 12.0330 |
	| 10.7052 | 1.43 | 40 | 11.9875 |
	| 10.6594 | 1.61 | 45 | 11.9432 |
	| 10.6863 | 1.79 | 50 | 11.8997 |
	| 10.7858 | 1.96 | 55 | 11.8569 |
	| 10.626 | 2.14 | 60 | 11.8158 |
	| 10.4246 | 2.32 | 65 | 11.7756 |
	| 10.3939 | 2.5 | 70 | 11.7359 |
	| 10.7641 | 2.68 | 75 | 11.6970 |
	| 10.341 | 2.86 | 80 | 11.6597 |
	| 10.3492 | 3.04 | 85 | 11.6228 |
	| 10.797 | 3.21 | 90 | 11.5867 |
	| 10.3496 | 3.39 | 95 | 11.5514 |
	| 10.1967 | 3.57 | 100 | 11.5177 |
	| 10.4702 | 3.75 | 105 | 11.4843 |
	| 10.3715 | 3.93 | 110 | 11.4521 |
	| 10.1039 | 4.11 | 115 | 11.4213 |
	| 10.1126 | 4.29 | 120 | 11.3915 |
	| 9.9939 | 4.46 | 125 | 11.3625 |
	| 10.1773 | 4.64 | 130 | 11.3342 |
	| 10.062 | 4.82 | 135 | 11.3068 |
	| 10.2641 | 5.0 | 140 | 11.2806 |
	| 10.2323 | 5.18 | 145 | 11.2554 |
	| 10.037 | 5.36 | 150 | 11.2309 |
	| 10.0938 | 5.54 | 155 | 11.2069 |
	| 9.8816 | 5.71 | 160 | 11.1845 |
	| 10.124 | 5.89 | 165 | 11.1625 |
	| 9.873 | 6.07 | 170 | 11.1416 |
	| 9.7348 | 6.25 | 175 | 11.1220 |
	| 9.9028 | 6.43 | 180 | 11.1028 |
	| 9.997 | 6.61 | 185 | 11.0846 |
	| 9.9333 | 6.79 | 190 | 11.0676 |
	| 9.9954 | 6.96 | 195 | 11.0511 |
	| 10.311 | 7.14 | 200 | 11.0356 |
	| 9.7617 | 7.32 | 205 | 11.0213 |
	| 10.0068 | 7.5 | 210 | 11.0075 |
	| 9.6182 | 7.68 | 215 | 10.9949 |
	| 9.7642 | 7.86 | 220 | 10.9835 |
	| 9.8524 | 8.04 | 225 | 10.9728 |
	| 9.7615 | 8.21 | 230 | 10.9630 |
	| 9.7559 | 8.39 | 235 | 10.9542 |
	| 9.5819 | 8.57 | 240 | 10.9461 |
	| 9.5843 | 8.75 | 245 | 10.9392 |
	| 10.05 | 8.93 | 250 | 10.9331 |
	| 10.0722 | 9.11 | 255 | 10.9276 |
	| 9.665 | 9.29 | 260 | 10.9233 |
	| 9.7631 | 9.46 | 265 | 10.9197 |
	| 9.7963 | 9.64 | 270 | 10.9172 |
	| 9.9692 | 9.82 | 275 | 10.9155 |
	| 9.885 | 10.0 | 280 | 10.9150 |
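
The Epoch and Step columns can be cross-checked with a little arithmetic. The sketch below derives the steps per epoch from the table and the listed hyperparameters, then bounds the number of training examples. The bounds rest on assumptions the card does not confirm: a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped. All names are hypothetical and exist only for this illustration.

```python
TOTAL_STEPS = 280        # last Step value in the table
NUM_EPOCHS = 10          # num_epochs from the hyperparameters
TRAIN_BATCH_SIZE = 32    # train_batch_size from the hyperparameters
FIRST_VAL_LOSS = 12.3262 # validation loss at step 5
FINAL_VAL_LOSS = 10.9150 # validation loss at step 280

# Optimizer steps taken in each epoch.
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS


def epoch_at(step: int) -> float:
    """Fractional epoch reached after `step` updates, rounded as in the table."""
    return round(step / steps_per_epoch, 2)


# Assumption: one device, no gradient accumulation, last partial batch kept.
# The final batch of an epoch then holds between 1 and TRAIN_BATCH_SIZE examples.
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE

# Absolute drop in validation loss between the first and last evaluation.
val_loss_drop = FIRST_VAL_LOSS - FINAL_VAL_LOSS

if __name__ == "__main__":
    print(f"steps per epoch: {steps_per_epoch}")
    print(f"epoch at step 5: {epoch_at(5)}")
    print(f"training examples: between {min_examples} and {max_examples}")
    print(f"validation loss drop: {val_loss_drop:.4f}")
```

If those assumptions hold, the training set is small (fewer than 900 examples), which is worth keeping in mind when reading the loss values above.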


	### Framework versions

	- Transformers 4.34.1
	- PyTorch 2.0.1+cu118
	- Datasets 2.14.5
	- Tokenizers 0.14.1
