---
license: apache-2.0
datasets:
- pongjin/en_corpora_parliament_processed
language:
- en
pipeline_tag: automatic-speech-recognition
metrics:
- wer
---
**This model was built with reference to the following posts:**  
1) https://huggingface.co/blog/wav2vec2-with-ngram
2) https://huggingface.co/blog/fine-tune-wav2vec2-english  

Thanks to [Patrick von Platen](https://huggingface.co/patrickvonplaten).

This is an ASR + LM model: a model fine-tuned from facebook/wav2vec2-large-960h to improve recognition of English spoken by Korean speakers, with a KenLM 5-gram language model attached.

To use the LM, you must have kenlm installed (https://github.com/kpu/kenlm):
```bash
pip install https://github.com/kpu/kenlm/archive/master.zip
```
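
Once kenlm is installed, decoding with the language model can be sketched roughly as below. This follows the pattern from the wav2vec2-with-ngram blog post linked above, not a script published with this model. `transcribe_with_lm` is a hypothetical helper name, and `model_id` must be replaced with this repository's actual id, which is not stated here. `Wav2Vec2ProcessorWithLM` also needs the `pyctcdecode` package, and the audio is assumed to be mono at 16 kHz, as expected by facebook/wav2vec2-large-960h.

```python
from typing import Sequence


def transcribe_with_lm(
    audio: Sequence[float],
    model_id: str,
    sampling_rate: int = 16_000,
) -> str:
    """Transcribe one mono waveform using CTC logits plus n-gram LM decoding.

    `model_id` is a placeholder argument: pass the Hub id of this model.
    """
    # Heavy dependencies are imported lazily so the helper can be defined
    # without loading torch/transformers up front.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2ProcessorWithLM

    # The processor bundles the feature extractor, tokenizer and KenLM decoder.
    processor = Wav2Vec2ProcessorWithLM.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)

    inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # batch_decode runs beam search with the LM over the raw logits.
    return processor.batch_decode(logits.numpy()).text[0]
```

The waveform can be loaded with any audio library that returns float samples at 16 kHz; resample first if the recording uses a different rate.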
Training data source: https://aiopen.etri.re.kr/voiceModel

>transformers==4.24.0  
>huggingface_hub==0.13.2

| WER | epochs | batch size | learning rate | weight decay | warmup steps |
| --- | --- | --- | --- | --- | --- |
| 0.17 | 10 | 16 | 1e-4 | 0.005 | 1000 |
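
For readers unfamiliar with the metric in the table, word error rate is the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. The sketch below is a minimal illustration of that standard definition for a single utterance. `word_error_rate` is a hypothetical helper, not the evaluation code used to produce the score above, and it assumes simple whitespace tokenization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word-level Levenshtein distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Over a whole test set, WER is usually computed by summing the errors across all utterances and dividing by the total number of reference words, rather than by averaging per-utterance scores.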