Add ONNX and ORT models with quantization
- .gitattributes +4 -0
- README.md +85 -0
- README_ja.md +85 -0
- onnx_models/model.onnx +3 -0
- onnx_models/model_fp16.onnx +3 -0
- onnx_models/model_int8.onnx +3 -0
- onnx_models/model_opt.onnx +3 -0
- onnx_models/model_uint8.onnx +3 -0
- ort_models/model.ort +3 -0
- ort_models/model_fp16.ort +3 -0
- ort_models/model_int8.ort +3 -0
- ort_models/model_uint8.ort +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+ort_models/model.ort filter=lfs diff=lfs merge=lfs -text
+ort_models/model_fp16.ort filter=lfs diff=lfs merge=lfs -text
+ort_models/model_int8.ort filter=lfs diff=lfs merge=lfs -text
+ort_models/model_uint8.ort filter=lfs diff=lfs merge=lfs -text
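
The four added lines route each `.ort` file through Git LFS. As a rough illustration, the sketch below lists which patterns in a `.gitattributes` text carry `filter=lfs`. The `lfs_patterns` helper and the `SAMPLE` string are hypothetical names introduced here, and the parser assumes whitespace-separated fields with no quoted patterns, which is simpler than Git's real attribute syntax.

```python
def lfs_patterns(gitattributes_text: str) -> list:
    """Return the path patterns whose attributes include filter=lfs."""
    patterns = []
    for line in gitattributes_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Assumption: the pattern contains no spaces, so a plain split is enough.
        pattern, *attrs = line.split()
        if "filter=lfs" in attrs:
            patterns.append(pattern)
    return patterns


# The context lines and added lines from the hunk above, without diff markers.
SAMPLE = """\
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
ort_models/model.ort filter=lfs diff=lfs merge=lfs -text
ort_models/model_fp16.ort filter=lfs diff=lfs merge=lfs -text
ort_models/model_int8.ort filter=lfs diff=lfs merge=lfs -text
ort_models/model_uint8.ort filter=lfs diff=lfs merge=lfs -text
"""

new_entries = [p for p in lfs_patterns(SAMPLE) if p.startswith("ort_models/")]
print(new_entries)
```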
README.md
ADDED
@@ -0,0 +1,85 @@
---
license: apache-2.0
tags:
- onnx
- ort
---

# ONNX and ORT models with quantization of [answerdotai/answerai-colbert-small-v1](https://huggingface.co/answerdotai/answerai-colbert-small-v1)

[日本語READMEはこちら](README_ja.md)

This repository contains the ONNX and ORT formats of the model [answerdotai/answerai-colbert-small-v1](https://huggingface.co/answerdotai/answerai-colbert-small-v1), along with quantized versions.

## License
The license for this model is "apache-2.0". For details, please refer to the original model page: [answerdotai/answerai-colbert-small-v1](https://huggingface.co/answerdotai/answerai-colbert-small-v1).

## Usage
To use this model, install ONNX Runtime and perform inference as shown below.
```python
# Example code
import onnxruntime as ort
import numpy as np
from transformers import AutoTokenizer
import os

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained('answerdotai/answerai-colbert-small-v1')

# Prepare inputs
text = 'Replace this text with your input.'
inputs = tokenizer(text, return_tensors='np')

# Specify the model paths
# Test both the ONNX model and the ORT model
model_paths = [
    'onnx_models/model_opt.onnx',  # ONNX model
    'ort_models/model.ort'  # ORT format model
]

# Run inference with each model
for model_path in model_paths:
    print(f'\n===== Using model: {model_path} =====')
    # Get the model extension
    model_extension = os.path.splitext(model_path)[1]

    # Load the model
    if model_extension == '.ort':
        # Load the ORT format model
        session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
    else:
        # Load the ONNX model
        session = ort.InferenceSession(model_path)

    # Run inference
    outputs = session.run(None, dict(inputs))

    # Display the output shapes
    for idx, output in enumerate(outputs):
        print(f'Output {idx} shape: {output.shape}')

    # Display the results (add further processing if needed)
    print(outputs)
```
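
The example stops at printing raw outputs. ColBERT-style models are usually scored with late interaction (MaxSim): each query token is matched to its most similar document token and the maxima are summed. The sketch below shows that step on toy arrays. It assumes the first model output holds per-token embeddings of shape `(1, seq_len, dim)`, which this README does not state, and `maxsim_score` is a hypothetical helper name introduced here.

```python
import numpy as np


def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """Late-interaction (MaxSim) score between two sets of token embeddings.

    query_emb has shape (num_query_tokens, dim); doc_emb has shape
    (num_doc_tokens, dim).
    """
    # L2-normalize each token vector so dot products are cosine similarities.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    # Similarity of every query token to every document token.
    sim = q @ d.T
    # Keep the best-matching document token per query token, then sum.
    return float(sim.max(axis=1).sum())


# Toy embeddings standing in for model outputs with the batch axis removed.
query = np.array([[1.0, 0.0], [0.0, 1.0]])
doc = np.array([[1.0, 0.0], [1.0, 1.0]])
print(maxsim_score(query, doc))
```

With real outputs, and under the shape assumption above, the inputs would be `outputs[0][0]` from a query run and from a document run. Full ColBERT pipelines typically also add query and document marker tokens and mask padding, which this sketch leaves out.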

## Contents of the Model
This repository includes the following models:

### ONNX Models
- `onnx_models/model.onnx`: Original ONNX model converted from [answerdotai/answerai-colbert-small-v1](https://huggingface.co/answerdotai/answerai-colbert-small-v1)
- `onnx_models/model_opt.onnx`: Optimized ONNX model
- `onnx_models/model_fp16.onnx`: FP16 quantized model
- `onnx_models/model_int8.onnx`: INT8 quantized model
- `onnx_models/model_uint8.onnx`: UINT8 quantized model

### ORT Models
- `ort_models/model.ort`: ORT model using the optimized ONNX model
- `ort_models/model_fp16.ort`: ORT model using the FP16 quantized model
- `ort_models/model_int8.ort`: ORT model using the INT8 quantized model
- `ort_models/model_uint8.ort`: ORT model using the UINT8 quantized model
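
The README does not say how these variants were produced. One common route, shown here only as a sketch, is ONNX Runtime's dynamic quantization (`quantize_dynamic`) for the INT8 and UINT8 files, followed by the `onnxruntime.tools.convert_onnx_models_to_ort` script for the ORT format. The helper names (`variant_paths`, `quantize_variants`, `ort_convert_command`) are hypothetical, the output naming simply mirrors this repository, and the FP16 step is omitted.

```python
import sys
from pathlib import Path


def variant_paths(src: Path) -> dict:
    """Map a variant name to an output path next to `src`."""
    return {
        "int8": src.with_name("model_int8.onnx"),
        "uint8": src.with_name("model_uint8.onnx"),
    }


def quantize_variants(src: Path) -> dict:
    """Write dynamically quantized INT8 and UINT8 copies of `src`."""
    # Imported lazily so the other helpers work without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    weight_types = {"int8": QuantType.QInt8, "uint8": QuantType.QUInt8}
    outputs = variant_paths(src)
    for name, dst in outputs.items():
        quantize_dynamic(str(src), str(dst), weight_type=weight_types[name])
    return outputs


def ort_convert_command(onnx_path: Path) -> list:
    """Build the command line for ONNX Runtime's ONNX-to-ORT conversion script."""
    return [
        sys.executable,
        "-m",
        "onnxruntime.tools.convert_onnx_models_to_ort",
        str(onnx_path),
    ]
```

Calling `quantize_variants(Path('onnx_models/model_opt.onnx'))` and then running each `ort_convert_command(...)` with `subprocess.run(..., check=True)` would be one way to chain the two steps. By default the conversion script writes its output next to the input file, so the separate `ort_models/` directory here suggests the files were moved or an output directory was specified.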

## Notes
Please adhere to the license and usage conditions of the original model [answerdotai/answerai-colbert-small-v1](https://huggingface.co/answerdotai/answerai-colbert-small-v1).

## Contribution
If you find a problem or have an improvement to suggest, please open an issue or submit a pull request.
README_ja.md
ADDED
@@ -0,0 +1,85 @@
---
license: apache-2.0
tags:
- onnx
- ort
---

# ONNX and ORT models and quantized models of [answerdotai/answerai-colbert-small-v1](https://huggingface.co/answerdotai/answerai-colbert-small-v1)

[Click here for the English README](README.md)

This repository contains the original model [answerdotai/answerai-colbert-small-v1](https://huggingface.co/answerdotai/answerai-colbert-small-v1) converted to the ONNX and ORT formats and then quantized.

## License
The license for this model is "apache-2.0". For details, refer to the original model page ([answerdotai/answerai-colbert-small-v1](https://huggingface.co/answerdotai/answerai-colbert-small-v1)).

## Usage
To use this model, install ONNX Runtime and run inference as follows.
```python
# Sample code
import onnxruntime as ort
import numpy as np
from transformers import AutoTokenizer
import os

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained('answerdotai/answerai-colbert-small-v1')

# Prepare the input
text = 'Replace this text with your input.'
inputs = tokenizer(text, return_tensors='np')

# Specify the paths of the models to use
# Test both the ONNX model and the ORT model
model_paths = [
    'onnx_models/model_opt.onnx',  # ONNX model
    'ort_models/model.ort'  # ORT format model
]

# Run inference with each model
for model_path in model_paths:
    print(f'\n===== Using model: {model_path} =====')
    # Get the model's file extension
    model_extension = os.path.splitext(model_path)[1]

    # Load the model
    if model_extension == '.ort':
        # Load the ORT format model
        session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
    else:
        # Load the ONNX model
        session = ort.InferenceSession(model_path)

    # Run inference
    outputs = session.run(None, dict(inputs))

    # Display the output shapes
    for idx, output in enumerate(outputs):
        print(f'Output {idx} shape: {output.shape}')

    # Display the results (add further processing as needed)
    print(outputs)
```

## Contents of the Model
This repository contains the following models.

### ONNX Models
- `onnx_models/model.onnx`: Original ONNX model converted from [answerdotai/answerai-colbert-small-v1](https://huggingface.co/answerdotai/answerai-colbert-small-v1)
- `onnx_models/model_opt.onnx`: Optimized ONNX model
- `onnx_models/model_fp16.onnx`: FP16 quantized model
- `onnx_models/model_int8.onnx`: INT8 quantized model
- `onnx_models/model_uint8.onnx`: UINT8 quantized model

### ORT Models
- `ort_models/model.ort`: ORT model using the optimized ONNX model
- `ort_models/model_fp16.ort`: ORT model using the FP16 quantized model
- `ort_models/model_int8.ort`: ORT model using the INT8 quantized model
- `ort_models/model_uint8.ort`: ORT model using the UINT8 quantized model

## Notes
Please comply with the license and terms of use of the original model [answerdotai/answerai-colbert-small-v1](https://huggingface.co/answerdotai/answerai-colbert-small-v1).

## Contribution
If you find a problem or a possible improvement, please create an issue or send a pull request.
onnx_models/model.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:473067fbbcd0ea0a064ab285cdf08ef5ed041b4c6b7b3aa7204e5281677c0676
size 133657185
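
Each model file in this commit is stored as a Git LFS pointer with three fields: the spec version, a SHA-256 object ID, and the size in bytes. As a minimal sketch, a downloaded file could be checked against its pointer as below. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names introduced here, and the parser assumes exactly the three-line layout shown in this diff.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "version": fields["version"],
        # The oid field has the form "sha256:<hex digest>".
        "sha256": fields["oid"].split(":", 1)[1],
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file has the size and SHA-256 recorded in the pointer."""
    # Cheap check first: a size mismatch makes hashing unnecessary.
    if path.stat().st_size != pointer["size"]:
        return False
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Hash in chunks so large model files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer["sha256"]


# Pointer text copied from the onnx_models/model.onnx entry in this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:473067fbbcd0ea0a064ab285cdf08ef5ed041b4c6b7b3aa7204e5281677c0676
size 133657185
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["size"])
```

After fetching the real file, `matches_pointer(Path('onnx_models/model.onnx'), pointer)` would confirm that the download is complete and unmodified.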
onnx_models/model_fp16.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:dfce402c5ae54479e681c49b69f9ea02e42960dac97a0b8696e5ac166ced7d03
size 66973135
onnx_models/model_int8.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d35bc752cb8d6ef0ccbb4677d0da6fb6cb5d0fac3f14ad43d383f3c5d9389712
size 33888405
onnx_models/model_opt.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:dba6793cc6aa2d391557d2059ef9de6dc0c89aa9b8d341c64a12e4503e3bb7a4
size 133631302
onnx_models/model_uint8.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:122c96226d10e1f66bde55e3419591aa69d39de2b4bd651e7631cd928c258a95
size 33888433
ort_models/model.ort
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f2e1d4b7c3d6edf0090cbd9a388041f3f7956a2acb24e7c5dd9a22bae64ed955
size 133826008
ort_models/model_fp16.ort
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5b2973ddbc4bb7b01a13187da707d26818f9811a7eeef185c49662810121bc56
size 67521728
ort_models/model_int8.ort
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a32d38a327e6a6982e57ae73325204473d2adcb6560f63ea179b923f0c9baebc
size 34053344
ort_models/model_uint8.ort
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:546b637bf7205bf24f4d7de714cf6ea0351a610edf8b9af59a7d5e42ced7ff19
size 34053344
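
The `size` fields in the pointers give a quick view of what each variant saves on disk. The sketch below only does arithmetic on the byte counts recorded in this commit. It uses `onnx_models/model.onnx` as the baseline, which is a choice made here for illustration, and it says nothing about accuracy or inference speed.

```python
# Sizes in bytes, copied from the Git LFS pointers in this commit.
SIZES = {
    "onnx_models/model.onnx": 133657185,
    "onnx_models/model_opt.onnx": 133631302,
    "onnx_models/model_fp16.onnx": 66973135,
    "onnx_models/model_int8.onnx": 33888405,
    "onnx_models/model_uint8.onnx": 33888433,
    "ort_models/model.ort": 133826008,
    "ort_models/model_fp16.ort": 67521728,
    "ort_models/model_int8.ort": 34053344,
    "ort_models/model_uint8.ort": 34053344,
}

# Size of each file relative to the unoptimized ONNX export.
baseline = SIZES["onnx_models/model.onnx"]
ratios = {path: size / baseline for path, size in SIZES.items()}

for path, size in SIZES.items():
    print(f"{path:32s} {size / 2**20:8.1f} MiB  {ratios[path]:.3f}x")
```

The FP16 files come out at roughly half the baseline and the 8-bit files at roughly a quarter, which matches what the change in bytes per weight would suggest.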