|
---
license: mit
base_model: facebook/w2v-bert-2.0
tags:
- generated_from_trainer
metrics:
- wer
model-index:
- name: w2v-bert-tamil_new
  results: []
---
|
|
|
|
|
|
# w2v-bert-tamil_new |
|
|
|
This model is a fine-tuned version of [facebook/w2v-bert-2.0](https://huggingface.co/facebook/w2v-bert-2.0) for Tamil automatic speech recognition; the fine-tuning dataset is not recorded in the training metadata.
|
It achieves the following results on the evaluation set: |
|
- Loss: 0.0960 |
|
- WER: 0.1781 (word error rate)
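
WER counts word-level substitutions, deletions, and insertions against a reference transcript, normalized by the reference length. As a quick illustration (this is not the model's evaluation code, and the example strings are made up), the metric can be computed with the `evaluate` library:

```python
# Minimal WER sketch using the `evaluate` library.
# The prediction/reference pair below is illustrative only.
import evaluate

wer_metric = evaluate.load("wer")
predictions = ["the cat sat on mat"]
references = ["the cat sat on the mat"]

# One deleted word out of six reference words -> WER = 1/6 ~ 0.167
print(wer_metric.compute(predictions=predictions, references=references))
```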
|
|
|
## Model description |
|
|
|
w2v-bert-tamil_new fine-tunes the Conformer-based [facebook/w2v-bert-2.0](https://huggingface.co/facebook/w2v-bert-2.0) speech encoder for Tamil speech-to-text. Given the WER metric and the standard fine-tuning recipe for this base model, the checkpoint most likely adds a CTC head over a Tamil character vocabulary; no further architectural details are documented.
|
|
|
## Intended uses & limitations |
|
|
|
The model is intended for transcribing Tamil speech from 16 kHz mono audio, the input format w2v-bert-2.0 expects. Because the training data is not documented, robustness across dialects, speakers, domains, and recording conditions is unknown; the reported WER of 0.1781 applies only to the unspecified held-out evaluation set.
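
A minimal inference sketch is shown below. It assumes the checkpoint ships both a CTC head and its processor/tokenizer (the usual layout for Trainer-exported CTC models); `your-username/w2v-bert-tamil_new` is a placeholder to replace with the actual repository id:

```python
# Minimal inference sketch; assumes a CTC head and processor are bundled with
# the checkpoint. "your-username/w2v-bert-tamil_new" is a placeholder repo id.
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2BertForCTC

model_id = "your-username/w2v-bert-tamil_new"  # placeholder
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2BertForCTC.from_pretrained(model_id)

# w2v-bert-2.0 expects 16 kHz mono audio
speech, _ = librosa.load("tamil_sample.wav", sr=16_000, mono=True)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: argmax per frame; repeats and blanks collapse in decode
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```

If the checkpoint is complete, the same can be done in one step with `pipeline("automatic-speech-recognition", model=model_id)`.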
|
|
|
## Training and evaluation data |
|
|
|
Not documented; the auto-generated metadata does not record which Tamil dataset(s) were used for training or evaluation.
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows the list):
|
- learning_rate: 4e-05 |
|
- train_batch_size: 2 |
|
- eval_batch_size: 1 |
|
- seed: 42 |
|
- gradient_accumulation_steps: 4 |
|
- total_train_batch_size: 8 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: linear |
|
- lr_scheduler_warmup_steps: 2000 |
|
- num_epochs: 5 |
|
- mixed_precision_training: Native AMP |
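
For reference, the settings above correspond roughly to this `TrainingArguments` configuration. This is a reconstruction from the list, not the original training script; the output directory is a placeholder:

```python
# Approximate reconstruction of the training configuration listed above.
# The output directory is a placeholder; data/model wiring is omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="w2v-bert-tamil_new",  # placeholder
    learning_rate=4e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,    # effective train batch size: 2 * 4 = 8
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=2000,
    num_train_epochs=5,
    fp16=True,                        # "Native AMP" mixed precision
)
```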
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch | Step | Validation Loss | WER |
|:-------------:|:------:|:-----:|:---------------:|:------:|
| 0.3099 | 0.1547 | 2000 | 0.2685 | 0.4726 |
| 0.2319 | 0.3094 | 4000 | 0.2052 | 0.3246 |
| 0.21 | 0.4640 | 6000 | 0.1702 | 0.2968 |
| 0.1907 | 0.6187 | 8000 | 0.1591 | 0.2809 |
| 0.1789 | 0.7734 | 10000 | 0.1468 | 0.2703 |
| 0.1626 | 0.9281 | 12000 | 0.1482 | 0.2540 |
| 0.1469 | 1.0828 | 14000 | 0.1390 | 0.2536 |
| 0.144 | 1.2375 | 16000 | 0.1298 | 0.2433 |
| 0.1418 | 1.3921 | 18000 | 0.1287 | 0.2399 |
| 0.1349 | 1.5468 | 20000 | 0.1219 | 0.2343 |
| 0.1266 | 1.7015 | 22000 | 0.1229 | 0.2349 |
| 0.1257 | 1.8562 | 24000 | 0.1202 | 0.2241 |
| 0.1209 | 2.0109 | 26000 | 0.1193 | 0.2176 |
| 0.1113 | 2.1655 | 28000 | 0.1146 | 0.2150 |
| 0.1052 | 2.3202 | 30000 | 0.1165 | 0.2234 |
| 0.103 | 2.4749 | 32000 | 0.1130 | 0.2112 |
| 0.0988 | 2.6296 | 34000 | 0.1092 | 0.2029 |
| 0.098 | 2.7843 | 36000 | 0.1061 | 0.2022 |
| 0.1007 | 2.9390 | 38000 | 0.1054 | 0.2036 |
| 0.0823 | 3.0936 | 40000 | 0.1042 | 0.1997 |
| 0.0866 | 3.2483 | 42000 | 0.1020 | 0.1945 |
| 0.0874 | 3.4030 | 44000 | 0.0993 | 0.1972 |
| 0.0825 | 3.5577 | 46000 | 0.1012 | 0.1941 |
| 0.083 | 3.7124 | 48000 | 0.1017 | 0.1911 |
| 0.0724 | 3.8671 | 50000 | 0.0992 | 0.1904 |
| 0.0761 | 4.0217 | 52000 | 0.0983 | 0.1856 |
| 0.0641 | 4.1764 | 54000 | 0.1011 | 0.1857 |
| 0.0611 | 4.3311 | 56000 | 0.0980 | 0.1821 |
| 0.0646 | 4.4858 | 58000 | 0.0982 | 0.1816 |
| 0.062 | 4.6405 | 60000 | 0.0962 | 0.1786 |
| 0.0616 | 4.7951 | 62000 | 0.0951 | 0.1787 |
| 0.0607 | 4.9498 | 64000 | 0.0960 | 0.1781 |
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.41.1 |
|
- PyTorch 2.1.2+cu121
|
- Datasets 2.19.1 |
|
- Tokenizers 0.19.1 |
|
|