SimFonX committed on
Commit 54d0c7f · verified · 1 Parent(s): ef601af

Update README.md
Files changed (1): README.md +7 -50
README.md CHANGED
@@ -19,28 +19,20 @@ Optimized Whisper ONNX models packaged for easy deployment. Each zip contains al

  | Model | Language | Size | Target Use | Download |
  |-------|----------|------|------------|----------|
- | **Medium English** | English-only | ~486MB | High quality English transcription | [whisper-medium-en-onnx.zip](medium-en/whisper-medium-en-onnx.zip) |
- | **Small English** | English-only | ~85MB | Fast English transcription | [whisper-small-en-onnx.zip](small-en/whisper-small-en-onnx.zip) |
- | **Small Multilingual** | 99 languages | ~110MB | Fast multilingual transcription | [whisper-small-multilingual-onnx.zip](small-multilingual/whisper-small-multilingual-onnx.zip) |
- | **Medium Multilingual** | 99 languages | ~295MB | High quality multilingual | [whisper-medium-multilingual-onnx.zip](medium-multilingual/whisper-medium-multilingual-onnx.zip) |
- | **Large v3 Turbo** | 99 languages | ~530MB | Best quality, fastest large model | [whisper-large-v3-turbo-onnx.zip](large-v3-turbo/whisper-large-v3-turbo-onnx.zip) |
+ | **Small English** | English-only | 107MB | Fast English transcription | [whisper-small-en-onnx.zip](small-en/whisper-small-en-onnx.zip) |
+ | **Small Multilingual** | 99 languages | 245MB | Fast multilingual transcription | [whisper-small-multilingual-onnx.zip](small-multilingual/whisper-small-multilingual-onnx.zip) |
+ | **Medium English** | English-only | 247MB | High quality English transcription | [whisper-medium-en-onnx.zip](medium-en/whisper-medium-en-onnx.zip) |
+ | **Medium Multilingual** | 99 languages | 602MB | High quality multilingual | [whisper-medium-multilingual-onnx.zip](medium-multilingual/whisper-medium-multilingual-onnx.zip) |
+ | **Large v3 Turbo** | 99 languages | 646MB | Best quality, fastest large model | [whisper-large-v3-turbo-onnx.zip](large-v3-turbo/whisper-large-v3-turbo-onnx.zip) |

- ## Size Comparison vs GGML Q5_0
-
- All models are **smaller** than equivalent GGML Q5_0 models:
-
- - Medium English: 486MB vs 515MB GGML ✅ (-29MB)
- - Small models: ~85-110MB vs 182MB GGML ✅ (-72 to -97MB)
- - Large v3 Turbo: 530MB vs 574MB GGML ✅ (-44MB)

  ## Contents of Each Zip

- Each zip file contains 7 files needed for inference:
+ Each zip file contains 6 files needed for inference:

  ### ONNX Model Files
  - `encoder_model_quantized.onnx` - Audio encoder (processes mel spectrograms)
- - `decoder_model_merged_quantized.onnx` - Text decoder (generates transcription)
- - `decoder_with_past_model_quantized.onnx` - Optimized decoder with KV caching
+ - `decoder_with_past_model_quantized.onnx` - Text decoder with KV caching (generates transcription)

  ### Configuration Files
  - `config.json` - Model configuration
@@ -48,41 +40,6 @@ Each zip file contains 7 files needed for inference:
  - `preprocessor_config.json` - Audio preprocessing settings
  - `tokenizer.json` - Tokenizer vocabulary

- ## Usage
-
- ### C# with ONNX Runtime
- ```csharp
- // Download and extract zip
- var modelPath = "path/to/extracted/model/";
-
- // Initialize with DirectML support
- var sessionOptions = new SessionOptions();
- sessionOptions.AppendExecutionProvider_DML(0);
-
- var encoderSession = new InferenceSession(
-     Path.Combine(modelPath, "encoder_model_quantized.onnx"), sessionOptions);
- var decoderSession = new InferenceSession(
-     Path.Combine(modelPath, "decoder_with_past_model_quantized.onnx"), sessionOptions);
- ```
-
- ### Python with ONNX Runtime
- ```python
- import onnxruntime as ort
-
- # Load with DirectML/CUDA support
- providers = ['DmlExecutionProvider', 'CPUExecutionProvider']
- encoder_session = ort.InferenceSession('encoder_model_quantized.onnx', providers=providers)
- decoder_session = ort.InferenceSession('decoder_with_past_model_quantized.onnx', providers=providers)
- ```
-
- ## Features
-
- ✅ **DirectML Support** - Works with any DirectX 12 GPU (AMD, Intel, NVIDIA)
- ✅ **CUDA Support** - Accelerated inference on NVIDIA GPUs
- ✅ **CPU Fallback** - Automatic fallback to CPU if GPU unavailable
- ✅ **Quantized** - INT8/INT4 quantization for smaller size and faster inference
- ✅ **Complete** - All files needed for inference included
-
  ## Model Sources

  These models are repackaged from:
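
After extracting a zip, the file list in the updated README can be sanity-checked with a short script. This is a minimal sketch (the function name `missing_model_files` is hypothetical); it lists only the five filenames visible in this diff excerpt, since the sixth configuration file the README counts is not shown here:

```python
from pathlib import Path


def missing_model_files(model_dir: str) -> list[str]:
    """Return expected files absent from an extracted model directory.

    Only the filenames visible in the README diff are checked; the zip
    holds one more configuration file not shown in the excerpt.
    """
    expected = [
        "encoder_model_quantized.onnx",
        "decoder_with_past_model_quantized.onnx",
        "config.json",
        "preprocessor_config.json",
        "tokenizer.json",
    ]
    root = Path(model_dir)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    import tempfile

    # Demo: a directory with only two of the expected files present.
    with tempfile.TemporaryDirectory() as d:
        for name in ("encoder_model_quantized.onnx", "config.json"):
            (Path(d) / name).touch()
        print(missing_model_files(d))
```

Running the check before opening ONNX Runtime sessions gives a clearer error than a failed model load.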