juanfkurucz committed
Commit 596bfe9 · Parent(s): 2d22c94
Add blog link

README.md CHANGED
@@ -15,6 +15,8 @@ metrics:
 
 This is an optimized model using [madlag/bert-large-uncased-wwm-squadv2-x2.63-f82.6-d16-hybrid-v1](https://huggingface.co/madlag/bert-large-uncased-wwm-squadv2-x2.63-f82.6-d16-hybrid-v1) as the base model which was created using the [nn_pruning](https://github.com/huggingface/nn_pruning) python library. This is a pruned model of [madlag/bert-large-uncased-whole-word-masking-finetuned-squadv2](https://huggingface.co/madlag/bert-large-uncased-whole-word-masking-finetuned-squadv2)
 
+Feel free to read our blog about how we optimized this model [(link)](https://tryolabs.com/blog/2022/11/24/transformer-based-model-for-faster-inference)
+
 Our final optimized model weighs **579 MB**, has an inference speed of **18.184 ms** on a Tesla T4 and has a performance of **82.68%** best F1. Below there is a comparison for each base model:
 
 | Model | Weight | Throughput on Tesla T4 | Best F1 |
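For intuition on what pruning does to a weight tensor, here is a toy, pure-Python sketch of magnitude pruning. This is an illustration only, not the nn_pruning API: the library actually removes structured blocks of attention and feed-forward weights during fine-tuning, whereas this sketch simply zeroes the smallest individual values.

```python
# Toy magnitude pruning: zero out the smallest-magnitude fraction of weights.
# Illustrative sketch only; nn_pruning prunes structured blocks, not single values.

def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude
    `sparsity` fraction of entries set to zero."""
    flat = sorted(abs(w) for w in weights)
    k = int(len(flat) * sparsity)                    # how many weights to drop
    threshold = flat[k] if k < len(flat) else float("inf")
    return [w if abs(w) >= threshold else 0.0 for w in weights]

# Half of the six weights (the three smallest in magnitude) get zeroed.
pruned = prune_by_magnitude([0.9, -0.05, 0.4, 0.01, -0.7, 0.2], 0.5)
print(pruned)  # → [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]
```

Zeroed blocks can be physically removed from the network, which is how the pruned model ends up both smaller on disk and faster at inference, at a modest cost in F1.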