YC-Li/Sequence-to-Sequence-ASR-Error-Correction

Error Correction


YC-Li committed on May 22, 2024

Commit 520b962 · verified · 1 Parent(s): ca94c75

Update README.md

Files changed (1)

README.md +30 -1

@@ -4,8 +4,37 @@ language:
 metrics:
 - wer
 - bleu
+- google_bleu
 tags:
 - ASR
 - Error Correction
 - Crossmodal
----
+---
+### Model Description
+Pre-Training Settings:
+166k samples from Common Voice 13.0 were recognized by Whisper tiny.en.
+1,000 random samples were selected as the test set, and the rest were used for training and validation with an 80%-20% split.
+- Batch size: 256
+- Initial learning rate: 1e-5
+- Adam optimizer
+- 30 epochs
+- Cross-entropy loss
+- Best checkpoint saved based on WER as the evaluation metric
+- Decoding is performed using beam search with a beam size of 5
+- S2S backbone model adopted from "[Exploring data augmentation for code generation tasks](https://aclanthology.org/2023.findings-eacl.114/)".
+Continue-Training Setting:
+- 2 epochs of gold-gold training to prevent the over-correction problem on "[TED talk data](https://cris.fbk.eu/bitstream/11582/104409/1/WIT3-EAMT2012.pdf)"
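
The data handling described in the model card above can be sketched roughly as follows. This is an illustrative sketch, not the authors' code: the function names (`split_dataset`, `word_error_rate`, `gold_gold_pairs`), the random seed, and the whitespace tokenization are all assumptions. The model card only states the 1,000-sample test set, the 80%-20% train/validation split, WER as the checkpoint-selection metric, and the use of gold-gold pairs in continued training. Gold-gold is read here as pairs whose input and target are both the reference transcript, which is one plausible interpretation.

```python
import random


def split_dataset(samples, test_size=1000, train_fraction=0.8, seed=0):
    """Hold out `test_size` random samples, then split the rest into
    training and validation sets (seed and names are assumptions)."""
    if len(samples) <= test_size:
        raise ValueError("need more samples than test_size")
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    test = shuffled[:test_size]
    rest = shuffled[test_size:]
    cut = round(len(rest) * train_fraction)
    return rest[:cut], rest[cut:], test


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words.
    Assumes whitespace tokenization; the card does not specify one."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)


def gold_gold_pairs(gold_transcripts):
    """Build (input, target) pairs where both sides are the gold transcript,
    so the model sees examples in which nothing should be corrected."""
    return [(text, text) for text in gold_transcripts]
```

With these helpers, a validation loop could compute the mean `word_error_rate` over the validation set after each epoch and keep the checkpoint with the lowest value, which matches the selection criterion listed above.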