Commit 005723d by ericchin
1 Parent(s): 707a130

Update README.md

Files changed (1)
  1. README.md +10 -7
README.md CHANGED
@@ -123,10 +123,13 @@ Whisper is a Transformer based encoder-decoder model, also referred to as a sequ
 
  | Model Type | Parameters | n_audio_ctx | n_audio_state | n_audio_head | n_audio_layer | n_text_ctx | n_text_state | n_text_head | n_text_layer | n_mels | n_vocab |
  |---------------------------|------------|-------------|---------------|--------------|---------------|------------|--------------|-------------|--------------|--------|---------|
- | whisper_tiny | 39 M | 1500 | 384 | 6 | 4 | 224 | 384 | 6 | 4 | 80 | 51864 |
- | whisper_base | 74 M | 1500 | 512 | 8 | 6 | 224 | 512 | 8 | 6 | 80 | 51864 |
- | **whisper_small** | 244 M | 1500 | 768 | 12 | 12 | 224 | 768 | 12 | 12 | 80 | 51864 |
- | whisper_medium | 769 M | 1500 | 1024 | 16 | 24 | 224 | 1024 | 16 | 16 | 80 | 51864 |
- | whisper_large_v1 | 1550 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 20 | 80 | 51864 |
- | whisper_large_v2 | 1550 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 20 | 80 | 51864 |
- | whisper_large_v3 | 1550 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 20 | 128 | 51864 |
+ | whisper-tiny | 39 M | 1500 | 384 | 6 | 4 | 224 | 384 | 6 | 4 | 80 | 51864 |
+ | whisper-base | 74 M | 1500 | 512 | 8 | 6 | 224 | 512 | 8 | 6 | 80 | 51864 |
+ | **whisper-small** | 244 M | 1500 | 768 | 12 | 12 | 224 | 768 | 12 | 12 | 80 | 51864 |
+ | whisper-medium | 769 M | 1500 | 1024 | 16 | 24 | 224 | 1024 | 16 | 16 | 80 | 51864 |
+ | whisper-large-v1 | 1550 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 20 | 80 | 51864 |
+ | whisper-large-v2 | 1550 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 20 | 80 | 51864 |
+ | whisper-distil-large-v2 | 756 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 2 | 80 | 51864 |
+ | whisper-large-v3 | 1550 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 20 | 128 | 51865 |
+ | whisper-distil-large-v3 | 756 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 2 | 128 | 51865 |
+ | whisper-large-v3-turbo | 809 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 4 | 128 | 51865 |
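
For anyone who wants to use these rows in code, here is a minimal sketch (not part of this commit) that mirrors the table's columns in a Python dataclass. The class name `WhisperDims`, the `head_dim` helper, and the `DIMS` lookup are hypothetical names chosen for illustration; the numbers are copied as-is from the `whisper-small` and `whisper-large-v3-turbo` rows of the updated table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WhisperDims:
    """Hypothetical container; field order follows the table's columns."""
    n_audio_ctx: int
    n_audio_state: int
    n_audio_head: int
    n_audio_layer: int
    n_text_ctx: int
    n_text_state: int
    n_text_head: int
    n_text_layer: int
    n_mels: int
    n_vocab: int

    def head_dim(self) -> int:
        """Width of one audio attention head: n_audio_state / n_audio_head."""
        if self.n_audio_state % self.n_audio_head != 0:
            raise ValueError("n_audio_state must be divisible by n_audio_head")
        return self.n_audio_state // self.n_audio_head


# Values copied from the updated table in this diff (two rows only).
DIMS = {
    "whisper-small": WhisperDims(1500, 768, 12, 12, 224, 768, 12, 12, 80, 51864),
    "whisper-large-v3-turbo": WhisperDims(1500, 1280, 20, 32, 224, 1280, 20, 4, 128, 51865),
}

for name, dims in DIMS.items():
    # Print the model name, per-head width, and decoder depth from the table.
    print(name, dims.head_dim(), dims.n_text_layer)
```

Keying the lookup by the hyphenated names introduced in this commit keeps it aligned with the renamed rows; the remaining rows can be added the same way.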