Add ONNX and ORT models with quantization
- .gitattributes +4 -0
- README.md +85 -0
- README_ja.md +85 -0
- onnx_models/model.onnx +3 -0
- onnx_models/model_fp16.onnx +3 -0
- onnx_models/model_int8.onnx +3 -0
- onnx_models/model_opt.onnx +3 -0
- onnx_models/model_uint8.onnx +3 -0
- ort_models/model.ort +3 -0
- ort_models/model_fp16.ort +3 -0
- ort_models/model_int8.ort +3 -0
- ort_models/model_uint8.ort +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+ort_models/model.ort filter=lfs diff=lfs merge=lfs -text
+ort_models/model_fp16.ort filter=lfs diff=lfs merge=lfs -text
+ort_models/model_int8.ort filter=lfs diff=lfs merge=lfs -text
+ort_models/model_uint8.ort filter=lfs diff=lfs merge=lfs -text
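The four added lines track each `.ort` file by its exact path, while the existing entries are globs. The sketch below checks which paths the patterns visible in this hunk would cover. It assumes Python's `fnmatch` is a good enough stand-in for Git's attribute matching (the real rules follow gitignore-style semantics and differ in details), and `is_lfs_tracked` is a hypothetical helper, not part of this commit.

```python
from fnmatch import fnmatchcase

# Patterns visible in this hunk: three context lines plus the four additions.
PATTERNS = [
    "*.zip",
    "*.zst",
    "*tfevents*",
    "ort_models/model.ort",
    "ort_models/model_fp16.ort",
    "ort_models/model_int8.ort",
    "ort_models/model_uint8.ort",
]


def is_lfs_tracked(path, patterns=PATTERNS):
    """Approximate check: match the full path or its basename against each pattern."""
    basename = path.rsplit("/", 1)[-1]
    return any(fnmatchcase(path, p) or fnmatchcase(basename, p) for p in patterns)


for path in ["ort_models/model_int8.ort", "onnx_models/model.onnx", "README.md"]:
    print(path, is_lfs_tracked(path))
```

The `.onnx` files do not match any pattern listed here, yet they are committed as LFS pointers, so they are presumably covered by a rule outside this hunk.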
README.md
ADDED
@@ -0,0 +1,85 @@
+---
+license: mit
+tags:
+- onnx
+- ort
+---
+
+# ONNX and ORT models with quantization of [google-bert/bert-base-german-dbmdz-cased](https://huggingface.co/google-bert/bert-base-german-dbmdz-cased)
+
+[Click here for the Japanese README](README_ja.md)
+
+This repository contains the ONNX and ORT formats of the model [google-bert/bert-base-german-dbmdz-cased](https://huggingface.co/google-bert/bert-base-german-dbmdz-cased), along with quantized versions.
+
+## License
+The license for this model is "mit". For details, please refer to the original model page: [google-bert/bert-base-german-dbmdz-cased](https://huggingface.co/google-bert/bert-base-german-dbmdz-cased).
+
+## Usage
+To use this model, install ONNX Runtime and perform inference as shown below.
+```python
+# Example code
+import onnxruntime as ort
+import numpy as np
+from transformers import AutoTokenizer
+import os
+
+# Load the tokenizer
+tokenizer = AutoTokenizer.from_pretrained('google-bert/bert-base-german-dbmdz-cased')
+
+# Prepare inputs
+text = 'Replace this text with your input.'
+inputs = tokenizer(text, return_tensors='np')
+
+# Specify the model paths
+# Test both the ONNX model and the ORT model
+model_paths = [
+    'onnx_models/model_opt.onnx',  # ONNX model
+    'ort_models/model.ort'  # ORT format model
+]
+
+# Run inference with each model
+for model_path in model_paths:
+    print(f'\n===== Using model: {model_path} =====')
+    # Get the model extension
+    model_extension = os.path.splitext(model_path)[1]
+
+    # Load the model
+    if model_extension == '.ort':
+        # Load the ORT format model
+        session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
+    else:
+        # Load the ONNX model
+        session = ort.InferenceSession(model_path)
+
+    # Run inference
+    outputs = session.run(None, dict(inputs))
+
+    # Display the output shapes
+    for idx, output in enumerate(outputs):
+        print(f'Output {idx} shape: {output.shape}')
+
+    # Display the results (add further processing if needed)
+    print(outputs)
+```
+
+## Contents of the Model
+This repository includes the following models:
+
+### ONNX Models
+- `onnx_models/model.onnx`: Original ONNX model converted from [google-bert/bert-base-german-dbmdz-cased](https://huggingface.co/google-bert/bert-base-german-dbmdz-cased)
+- `onnx_models/model_opt.onnx`: Optimized ONNX model
+- `onnx_models/model_fp16.onnx`: FP16 quantized model
+- `onnx_models/model_int8.onnx`: INT8 quantized model
+- `onnx_models/model_uint8.onnx`: UINT8 quantized model
+
+### ORT Models
+- `ort_models/model.ort`: ORT model using the optimized ONNX model
+- `ort_models/model_fp16.ort`: ORT model using the FP16 quantized model
+- `ort_models/model_int8.ort`: ORT model using the INT8 quantized model
+- `ort_models/model_uint8.ort`: ORT model using the UINT8 quantized model
+
+## Notes
+Please adhere to the license and usage conditions of the original model [google-bert/bert-base-german-dbmdz-cased](https://huggingface.co/google-bert/bert-base-german-dbmdz-cased).
+
+## Contribution
+If you find any issues or have improvements, please create an issue or submit a pull request.
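The usage example in README.md passes every tokenizer output to `session.run` via `dict(inputs)`. ONNX Runtime generally rejects feed names that the graph does not declare, so the call can fail if an exported model omits an input such as `token_type_ids`. A minimal sketch of a guard follows; `select_feeds` is a hypothetical helper (not part of the README), and plain lists with arbitrary values stand in for the NumPy arrays the tokenizer would return.

```python
def select_feeds(input_names, encoded):
    """Keep only the entries the model declares; fail early if one is missing."""
    missing = [name for name in input_names if name not in encoded]
    if missing:
        raise KeyError(f"tokenizer output lacks model inputs: {missing}")
    return {name: encoded[name] for name in input_names}


# Stand-in values; in the README these would come from
# tokenizer(text, return_tensors='np').
encoded = {
    "input_ids": [[2, 15, 3]],
    "token_type_ids": [[0, 0, 0]],
    "attention_mask": [[1, 1, 1]],
}

# Hypothetical model that declares only two of the three inputs.
feeds = select_feeds(["input_ids", "attention_mask"], encoded)
print(sorted(feeds))
```

With a real session, the declared names could be read as `[i.name for i in session.get_inputs()]` and the filtered dictionary passed to `session.run(None, feeds)`.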
README_ja.md
ADDED
@@ -0,0 +1,85 @@
+---
+license: mit
+tags:
+- onnx
+- ort
+---
+
+# ONNX and ORT models and quantized models of [google-bert/bert-base-german-dbmdz-cased](https://huggingface.co/google-bert/bert-base-german-dbmdz-cased)
+
+[Click here for the English README](README.md)
+
+This repository contains the original model [google-bert/bert-base-german-dbmdz-cased](https://huggingface.co/google-bert/bert-base-german-dbmdz-cased) converted to the ONNX and ORT formats and then quantized.
+
+## License
+The license for this model is "mit". For details, see the original model page ([google-bert/bert-base-german-dbmdz-cased](https://huggingface.co/google-bert/bert-base-german-dbmdz-cased)).
+
+## Usage
+To use this model, install ONNX Runtime and run inference as shown below.
+```python
+# Sample code
+import onnxruntime as ort
+import numpy as np
+from transformers import AutoTokenizer
+import os
+
+# Load the tokenizer
+tokenizer = AutoTokenizer.from_pretrained('google-bert/bert-base-german-dbmdz-cased')
+
+# Prepare the input
+text = 'Replace this text with your input.'
+inputs = tokenizer(text, return_tensors='np')
+
+# Specify the paths of the models to use
+# Test both the ONNX model and the ORT model
+model_paths = [
+    'onnx_models/model_opt.onnx',  # ONNX model
+    'ort_models/model.ort'  # ORT format model
+]
+
+# Run inference with each model
+for model_path in model_paths:
+    print(f'\n===== Using model: {model_path} =====')
+    # Get the model extension
+    model_extension = os.path.splitext(model_path)[1]
+
+    # Load the model
+    if model_extension == '.ort':
+        # Load the ORT format model
+        session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
+    else:
+        # Load the ONNX model
+        session = ort.InferenceSession(model_path)
+
+    # Run inference
+    outputs = session.run(None, dict(inputs))
+
+    # Display the output shapes
+    for idx, output in enumerate(outputs):
+        print(f'Output {idx} shape: {output.shape}')
+
+    # Display the results (add further processing if needed)
+    print(outputs)
+```
+
+## Contents of the Model
+This repository includes the following models.
+
+### ONNX Models
+- `onnx_models/model.onnx`: Original ONNX model converted from [google-bert/bert-base-german-dbmdz-cased](https://huggingface.co/google-bert/bert-base-german-dbmdz-cased)
+- `onnx_models/model_opt.onnx`: Optimized ONNX model
+- `onnx_models/model_fp16.onnx`: FP16 quantized model
+- `onnx_models/model_int8.onnx`: INT8 quantized model
+- `onnx_models/model_uint8.onnx`: UINT8 quantized model
+
+### ORT Models
+- `ort_models/model.ort`: ORT model using the optimized ONNX model
+- `ort_models/model_fp16.ort`: ORT model using the FP16 quantized model
+- `ort_models/model_int8.ort`: ORT model using the INT8 quantized model
+- `ort_models/model_uint8.ort`: ORT model using the UINT8 quantized model
+
+## Notes
+Please comply with the license and usage conditions of the original model [google-bert/bert-base-german-dbmdz-cased](https://huggingface.co/google-bert/bert-base-german-dbmdz-cased).
+
+## Contribution
+If you find any issues or have improvements, please create an issue or send a pull request.
onnx_models/model.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:01a9c5bb806ec2fc94b763a99e04ddaed5a30d6385e39dc5feef4a5be869c34c
+size 439928004
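Each model file in this commit is stored as a three-line Git LFS pointer like the one above: a spec version, a SHA-256 object ID, and a byte size. The sketch below parses such a pointer and checks a local file against it. Both function names are hypothetical, and the check assumes the `oid` uses `sha256`, as every pointer in this commit does.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and SHA-256 digest."""
    sha = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so large model files are never loaded whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and sha.hexdigest() == pointer["digest"]


# Pointer for onnx_models/model.onnx, copied from this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:01a9c5bb806ec2fc94b763a99e04ddaed5a30d6385e39dc5feef4a5be869c34c
size 439928004
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

After downloading the real file, `matches_pointer('onnx_models/model.onnx', pointer)` would confirm that the download is complete and uncorrupted.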
onnx_models/model_fp16.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1677c6add9c8844ad388a659e0580e7327e97938a8c4c6e391ec05fe664dd15f
+size 220108543
onnx_models/model_int8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:663299fb9350702872dd811f11fa029b662834b8e82a607959477a11205c929d
+size 110639304
onnx_models/model_opt.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:32e88e4f31d14a1f636c73478a3679b7412795508d94e417a9c4edb337c270b2
+size 439902121
onnx_models/model_uint8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b17b232f38ede7c4582f0e12bcdb3477622324c255d7581ff70e307a36511761
+size 110639337
ort_models/model.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6649635fb6f8319148b8c20c5a2b215988ef0cee0218350cd749b549b22047b9
+size 440096728
ort_models/model_fp16.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1b30586b3cbc6745b9d6f898926bfb9e0a97697b74856b3f82377337b8cbd5ff
+size 220657088
ort_models/model_int8.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:42e5278a396adcc630f4990667acbd64601fee6828eecb79f8a95173f73b7260
+size 110804192
ort_models/model_uint8.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4ec631a211f4b15fc9c40a37262ae200609425d90c47ad4802a5412e831d416a
+size 110804192
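The `size` fields in the pointers above make it possible to compare the variants without downloading them: the FP16 files are roughly half the size of the original ONNX model, and the INT8 and UINT8 files roughly a quarter. The following sketch works out the ratios; the sizes are copied from this commit, while `relative_size` is a hypothetical helper.

```python
# Sizes in bytes, copied from the LFS pointers in this commit.
SIZES = {
    "onnx_models/model.onnx": 439928004,
    "onnx_models/model_opt.onnx": 439902121,
    "onnx_models/model_fp16.onnx": 220108543,
    "onnx_models/model_int8.onnx": 110639304,
    "onnx_models/model_uint8.onnx": 110639337,
    "ort_models/model.ort": 440096728,
    "ort_models/model_fp16.ort": 220657088,
    "ort_models/model_int8.ort": 110804192,
    "ort_models/model_uint8.ort": 110804192,
}

BASELINE = SIZES["onnx_models/model.onnx"]


def relative_size(path):
    """Size of `path` as a fraction of the original ONNX model."""
    return SIZES[path] / BASELINE


for path, size in SIZES.items():
    print(f"{path:30} {size / 1e6:8.1f} MB  {relative_size(path):.3f}x")
```

The INT8 and UINT8 `.ort` files have the same byte size but different SHA-256 digests, so they are distinct files.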