GetmanY1 committed
Commit 0226760
1 Parent(s): 40041d1

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -7,9 +7,9 @@ tags:
 - finnish
 - pretraining
 ---
-# Finnish Wav2vec2-Base
+# Finnish Wav2vec2-Large

-The base model pre-trained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.
+The large model pre-trained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.

 **Note**: This model does not have a tokenizer as it was pre-trained on audio alone. In order to use this model for **speech recognition**, a tokenizer should be created and the model should be fine-tuned on labeled text data. Check out [this blog](https://huggingface.co/blog/fine-tune-xlsr-wav2vec2) for a more detailed explanation of how to fine-tune the model.

@@ -17,7 +17,7 @@ The base model pre-trained on 16kHz sampled speech audio. When using the model

 ## Model description

-The Finnish Wav2Vec2 Base has the same architecture and uses the same training objective as the English and multilingual models described in [the wav2vec 2.0 paper](https://arxiv.org/abs/2006.11477). It is pre-trained on 158k hours of unlabeled Finnish speech, including [KAVI radio and television archive materials](https://kavi.fi/en/radio-ja-televisioarkistointia-vuodesta-2008/), Lahjoita puhetta (Donate Speech), the Finnish Parliament, and Finnish VoxPopuli.
+The Finnish Wav2Vec2 Large has the same architecture and uses the same training objective as the English and multilingual models described in [the wav2vec 2.0 paper](https://arxiv.org/abs/2006.11477). It is pre-trained on 158k hours of unlabeled Finnish speech, including [KAVI radio and television archive materials](https://kavi.fi/en/radio-ja-televisioarkistointia-vuodesta-2008/), Lahjoita puhetta (Donate Speech), the Finnish Parliament, and Finnish VoxPopuli.

 You can read more about the pre-trained model in [this paper](TODO).
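
For context on the 16 kHz requirement in the updated README, here is a minimal usage sketch assuming the checkpoint is loaded through `transformers`; the repo id `GetmanY1/wav2vec2-large-fi-150k` and the file name `speech.wav` are illustrative, not taken from this commit:

```python
# Minimal sketch: feed 16 kHz mono audio to the pre-trained model.
# Repo id and file name are illustrative, not taken from the commit.
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

model_id = "GetmanY1/wav2vec2-large-fi-150k"  # illustrative
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2Model.from_pretrained(model_id)
model.eval()

waveform, sample_rate = torchaudio.load("speech.wav")  # (channels, frames)
waveform = waveform.mean(dim=0)                        # downmix to mono
if sample_rate != 16_000:
    # the model was pre-trained on 16 kHz audio, so resample first
    waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

inputs = feature_extractor(waveform.numpy(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    # hidden size is 1024 for the large architecture
    hidden_states = model(**inputs).last_hidden_state  # (1, frames, 1024)
print(hidden_states.shape)
```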
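
And a sketch of the fine-tuning setup the **Note** refers to, following the linked blog post; `vocab.json` is assumed to be a character-to-id vocabulary built from your own labeled transcripts, and the repo id is again illustrative:

```python
# Sketch: turn the pre-trained checkpoint into a CTC model for ASR
# fine-tuning, as in the linked blog post. vocab.json is assumed to be
# built from your labeled transcripts; the repo id is illustrative.
from transformers import (
    Wav2Vec2CTCTokenizer,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2ForCTC,
    Wav2Vec2Processor,
)

tokenizer = Wav2Vec2CTCTokenizer(
    "vocab.json", unk_token="[UNK]", pad_token="[PAD]", word_delimiter_token="|"
)
feature_extractor = Wav2Vec2FeatureExtractor(
    feature_size=1, sampling_rate=16_000, padding_value=0.0,
    do_normalize=True, return_attention_mask=True,
)
processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)

model = Wav2Vec2ForCTC.from_pretrained(
    "GetmanY1/wav2vec2-large-fi-150k",  # illustrative
    ctc_loss_reduction="mean",
    pad_token_id=tokenizer.pad_token_id,
    vocab_size=len(tokenizer),
)
model.freeze_feature_encoder()  # keep the CNN feature encoder frozen while fine-tuning
```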