Speed benchmark

by pritam

I would like to know whether this model is actually faster than the original model. Could you add some relevant benchmarks to the README?

Large-v3 models on GPU

| Implementation | Precision | Beam size | Time | Max. GPU memory (MB) | Max. CPU memory (MB) | WER (%) |
|---|---|---|---|---|---|---|
| openai/whisper-large-v3 | fp16 | 5 | 2m23s | | | |
| openai/whisper-turbo | fp16 | 5 | 39s | | | |
| faster-whisper | fp16 | 5 | 52.023s | 4521 | 901 | 2.883 |
| faster-whisper | int8 | 5 | 52.639s | 2953 | 2261 | 4.594 |
| faster-distil-large-v3 | fp16 | 5 | 26.126s | 2409 | 900 | 2.392 |
| faster-distil-large-v3 | int8 | 5 | 22.537s | 1481 | 1468 | 2.392 |
| faster-large-v3-turbo | fp16 | 5 | 19.155s | 2537 | 899 | 1.919 |
| faster-large-v3-turbo | int8 | 5 | 19.591s | 1545 | 1526 | 1.919 |

WER measured on the LibriSpeech clean validation split.
GPU: GeForce RTX 2080 Ti (11 GB)
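
For anyone who wants to verify the numbers themselves, here is a minimal timing sketch using the faster-whisper API. It is not the exact script used for the table above: `audio.wav` is a placeholder input, and `"large-v3"` would be swapped for the turbo conversion's Hugging Face repo ID or a local CTranslate2 model directory. One detail that matters for timing: `transcribe()` returns a lazy generator, so the segments must be consumed inside the timed region or the model never actually runs.

```python
import time
from faster_whisper import WhisperModel

# "large-v3" comes from the faster-whisper model catalog; replace it with
# the turbo conversion's repo ID or a local CTranslate2 path as needed.
# For the int8 rows, use compute_type="int8" instead of "float16".
model = WhisperModel("large-v3", device="cuda", compute_type="float16")

start = time.perf_counter()
segments, info = model.transcribe("audio.wav", beam_size=5)
# transcribe() is lazy: decoding happens while the generator is consumed,
# so the join below must stay inside the timed region.
text = "".join(segment.text for segment in segments)
elapsed = time.perf_counter() - start

print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
print(f"Transcription took {elapsed:.3f}s")
```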
