langgz committed
Commit dcbdafe · 1 Parent(s): 8366bd6

Update README.md

Files changed (1): README.md (+13, −3)
README.md CHANGED
@@ -12,8 +12,7 @@ tags:
 - ASR
 ---
 ## Introduction
-<p align="center">
-<img src="./struct.png" alt="Paraformer structure" width="500" />
+
 [Paraformer](https://arxiv.org/abs/2206.08317) is a non-autoregressive end-to-end speech recognition model. Compared to the currently mainstream autoregressive models, non-autoregressive models can output the target text for an entire sentence in parallel, making them particularly suitable for parallel inference on GPUs. Paraformer is the first known non-autoregressive model to match the accuracy of autoregressive end-to-end models on industrial-scale data. Combined with GPU inference, it can improve inference efficiency by a factor of 10, reducing machine costs for speech recognition cloud services by nearly 10x.
 
 This repo shows how to use Paraformer with the `funasr_onnx` runtime. The model comes from [FunASR](https://github.com/alibaba-damo-academy/FunASR) and was trained on 60,000 hours of Mandarin data. Paraformer took first place on the [SpeechIO Leaderboard](https://github.com/SpeechColab/Leaderboard).
@@ -65,4 +64,15 @@ Output: `List[str]`: recognition result
 
 ## Performance benchmark
 
-Please ref to [benchmark](https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/python/benchmark_onnx.md)
+Please refer to the [benchmark](https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/python/benchmark_onnx.md).
+
+## Citations
+
+```bibtex
+@inproceedings{gao2022paraformer,
+  title={Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition},
+  author={Gao, Zhifu and Zhang, Shiliang and McLoughlin, Ian and Yan, Zhijie},
+  booktitle={INTERSPEECH},
+  year={2022}
+}
+```
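
The hunk context above ("Output: `List[str]`: recognition result") points at the README's usage section, which this diff elides. For orientation, here is a minimal sketch of that `funasr_onnx` flow, assuming the `Paraformer` class described in the FunASR runtime docs; the model directory and wav path are placeholders:

```python
# A minimal usage sketch, assuming the funasr_onnx runtime API from the
# FunASR docs; the model directory and wav path below are placeholders.
from funasr_onnx import Paraformer

# Directory containing the exported Paraformer ONNX model (placeholder path).
model_dir = "./speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404"

# batch_size > 1 batches utterances for the parallel, non-autoregressive
# inference the README highlights; 1 keeps the sketch simple.
model = Paraformer(model_dir, batch_size=1)

# The runtime takes a list of wav paths and returns one recognition
# result per input (Output: List[str], per the README).
wav_paths = ["./asr_example.wav"]
result = model(wav_paths)
print(result)
```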