Add ONNX and ORT models with quantization

Files changed (10)

README.md +85 -85
README_ja.md +85 -85
onnx_models/model_fp16.onnx +1 -1
onnx_models/model_int8.onnx +1 -1
onnx_models/model_opt.onnx +1 -1
onnx_models/model_uint8.onnx +1 -1
ort_models/model.ort +2 -2
ort_models/model_fp16.ort +2 -2
ort_models/model_int8.ort +2 -2
ort_models/model_uint8.ort +2 -2

README.md CHANGED

@@ -1,85 +1,85 @@
----
-license: apache-2.0
-tags:
-- onnx
-- ort
----
-# ONNX and ORT models with quantization of [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased)
-[日本語READMEはこちら](README_ja.md)
-This repository contains the ONNX and ORT formats of the model [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased), along with quantized versions.
-## License
-The license for this model is "apache-2.0". For details, please refer to the original model page: [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased).
-## Usage
-To use this model, install ONNX Runtime and perform inference as shown below.
-```python
-# Example code
-import onnxruntime as ort
-import numpy as np
-from transformers import AutoTokenizer
-import os
-# Load the tokenizer
-tokenizer = AutoTokenizer.from_pretrained('google-bert/bert-base-cased')
-# Prepare inputs
-text = 'Replace this text with your input.'
-inputs = tokenizer(text, return_tensors='np')
-# Specify the model paths
-# Test both the ONNX model and the ORT model
-model_paths = [
-    'onnx_models/model_opt.onnx',    # ONNX model
-    'ort_models/model.ort'  # ORT format model
-]
-# Run inference with each model
-for model_path in model_paths:
-    print(f'\n===== Using model: {model_path} =====')
-    # Get the model extension
-    model_extension = os.path.splitext(model_path)[1]
-    # Load the model
-    if model_extension == '.ort':
-        # Load the ORT format model
-        session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
-    else:
-        # Load the ONNX model
-        session = ort.InferenceSession(model_path)
-    # Run inference
-    outputs = session.run(None, dict(inputs))
-    # Display the output shapes
-    for idx, output in enumerate(outputs):
-        print(f'Output {idx} shape: {output.shape}')
-    # Display the results (add further processing if needed)
-    print(outputs)
-```
-## Contents of the Model
-This repository includes the following models:
-### ONNX Models
-- `onnx_models/model.onnx`: Original ONNX model converted from [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased)
-- `onnx_models/model_opt.onnx`: Optimized ONNX model
-- `onnx_models/model_fp16.onnx`: FP16 quantized model
-- `onnx_models/model_int8.onnx`: INT8 quantized model
-- `onnx_models/model_uint8.onnx`: UINT8 quantized model
-### ORT Models
-- `ort_models/model.ort`: ORT model using the optimized ONNX model
-- `ort_models/model_fp16.ort`: ORT model using the FP16 quantized model
-- `ort_models/model_int8.ort`: ORT model using the INT8 quantized model
-- `ort_models/model_uint8.ort`: ORT model using the UINT8 quantized model
-## Notes
-Please adhere to the license and usage conditions of the original model [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased).
-## Contribution
-If you find any issues or have improvements, please create an issue or submit a pull request.

+---
+license: apache-2.0
+tags:
+- onnx
+- ort
+---
+# ONNX and ORT models with quantization of [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased)
+[日本語READMEはこちら](README_ja.md)
+This repository contains the ONNX and ORT formats of the model [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased), along with quantized versions.
+## License
+The license for this model is "apache-2.0". For details, please refer to the original model page: [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased).
+## Usage
+To use this model, install ONNX Runtime and perform inference as shown below.
+```python
+# Example code
+import onnxruntime as ort
+import numpy as np
+from transformers import AutoTokenizer
+import os
+# Load the tokenizer
+tokenizer = AutoTokenizer.from_pretrained('google-bert/bert-base-cased')
+# Prepare inputs
+text = 'Replace this text with your input.'
+inputs = tokenizer(text, return_tensors='np')
+# Specify the model paths
+# Test both the ONNX model and the ORT model
+model_paths = [
+    'onnx_models/model_opt.onnx',    # ONNX model
+    'ort_models/model.ort'  # ORT format model
+]
+# Run inference with each model
+for model_path in model_paths:
+    print(f'\n===== Using model: {model_path} =====')
+    # Get the model extension
+    model_extension = os.path.splitext(model_path)[1]
+    # Load the model
+    if model_extension == '.ort':
+        # Load the ORT format model
+        session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
+    else:
+        # Load the ONNX model
+        session = ort.InferenceSession(model_path)
+    # Run inference
+    outputs = session.run(None, dict(inputs))
+    # Display the output shapes
+    for idx, output in enumerate(outputs):
+        print(f'Output {idx} shape: {output.shape}')
+    # Display the results (add further processing if needed)
+    print(outputs)
+```
+## Contents of the Model
+This repository includes the following models:
+### ONNX Models
+- `onnx_models/model.onnx`: Original ONNX model converted from [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased)
+- `onnx_models/model_opt.onnx`: Optimized ONNX model
+- `onnx_models/model_fp16.onnx`: FP16 quantized model
+- `onnx_models/model_int8.onnx`: INT8 quantized model
+- `onnx_models/model_uint8.onnx`: UINT8 quantized model
+### ORT Models
+- `ort_models/model.ort`: ORT model using the optimized ONNX model
+- `ort_models/model_fp16.ort`: ORT model using the FP16 quantized model
+- `ort_models/model_int8.ort`: ORT model using the INT8 quantized model
+- `ort_models/model_uint8.ort`: ORT model using the UINT8 quantized model
+## Notes
+Please adhere to the license and usage conditions of the original model [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased).
+## Contribution
+If you find any issues or have improvements, please create an issue or submit a pull request.
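
The "Contents of the Model" list in this README maps each precision variant to one ONNX file and one ORT file. A minimal sketch of that mapping follows, so a caller can select a file by variant instead of hard-coding paths. The helper name `model_path` and the variant keys (`base`, `opt`, `fp16`, `int8`, `uint8`) are hypothetical and not part of the repository; only the file paths come from the README.

```python
from pathlib import PurePosixPath

# File stems as listed under "Contents of the Model" in the README.
# Note: the unquantized ORT file is `model.ort`, not `model_opt.ort`,
# and the README lists no ORT counterpart for the unoptimized `model.onnx`.
_STEMS = {
    "onnx": {
        "base": "model",
        "opt": "model_opt",
        "fp16": "model_fp16",
        "int8": "model_int8",
        "uint8": "model_uint8",
    },
    "ort": {
        "opt": "model",
        "fp16": "model_fp16",
        "int8": "model_int8",
        "uint8": "model_uint8",
    },
}


def model_path(variant: str, fmt: str = "onnx") -> str:
    """Return the repository-relative path for a variant/format pair.

    `variant` and `fmt` are illustrative keys defined in this sketch only.
    """
    try:
        stem = _STEMS[fmt][variant]
    except KeyError:
        raise ValueError(f"no {fmt!r} file for variant {variant!r}") from None
    return str(PurePosixPath(f"{fmt}_models") / f"{stem}.{fmt}")


print(model_path("int8"))        # onnx_models/model_int8.onnx
print(model_path("opt", "ort"))  # ort_models/model.ort
```

The returned string can be passed to `ort.InferenceSession(...)` exactly as the paths in the README example are.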

README_ja.md CHANGED

@@ -1,85 +1,85 @@
----
-license: apache-2.0
-tags:
-- onnx
-- ort
----
-# [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased) のONNXおよびORTモデルと量子化モデル
-[Click here for the English README](README.md)
-このリポジトリは、元のモデル [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased) をONNXおよびORT形式に変換し、さらに量子化したものです。
-## ライセンス
-このモデルのライセンスは「apache-2.0」です。詳細は元のモデルページ（[google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased)）を参照してください。
-## 使い方
-このモデルを使用するには、ONNX Runtimeをインストールし、以下のように推論を行います。
-```python
-# サンプルコード
-import onnxruntime as ort
-import numpy as np
-from transformers import AutoTokenizer
-import os
-# トークナイザーの読み込み
-tokenizer = AutoTokenizer.from_pretrained('google-bert/bert-base-cased')
-# 入力の準備
-text = 'ここに入力テキストを置き換えてください。'
-inputs = tokenizer(text, return_tensors='np')
-# 使用するモデルのパスを指定
-# ONNXモデルとORTモデルの両方をテストする
-model_paths = [
-    'onnx_models/model_opt.onnx',    # ONNXモデル
-    'ort_models/model.ort'  # ORTフォーマットのモデル
-]
-# モデルごとに推論を実行
-for model_path in model_paths:
-    print(f'\n===== Using model: {model_path} =====')
-    # モデルの拡張子を取得
-    model_extension = os.path.splitext(model_path)[1]
-    # モデルの読み込み
-    if model_extension == '.ort':
-        # ORTフォーマットのモデルをロード
-        session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
-    else:
-        # ONNXモデルをロード
-        session = ort.InferenceSession(model_path)
-    # 推論の実行
-    outputs = session.run(None, dict(inputs))
-    # 出力の形状を表示
-    for idx, output in enumerate(outputs):
-        print(f'Output {idx} shape: {output.shape}')
-    # 結果の表示（必要に応じて処理を追加）
-    print(outputs)
-```
-## モデルの内容
-このリポジトリには、以下のモデルが含まれています。
-### ONNXモデル
-- `onnx_models/model.onnx`: [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased) から変換された元のONNXモデル
-- `onnx_models/model_opt.onnx`: 最適化されたONNXモデル
-- `onnx_models/model_fp16.onnx`: FP16による量子化モデル
-- `onnx_models/model_int8.onnx`: INT8による量子化モデル
-- `onnx_models/model_uint8.onnx`: UINT8による量子化モデル
-### ORTモデル
-- `ort_models/model.ort`: 最適化されたONNXモデルを使用したORTモデル
-- `ort_models/model_fp16.ort`: FP16量子化モデルを使用したORTモデル
-- `ort_models/model_int8.ort`: INT8量子化モデルを使用したORTモデル
-- `ort_models/model_uint8.ort`: UINT8量子化モデルを使用したORTモデル
-## 注意事項
-元のモデル [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased) のライセンスおよび使用条件を遵守してください。
-## 貢献
-問題や改善点があれば、Issueを作成するかプルリクエストを送ってください。

+---
+license: apache-2.0
+tags:
+- onnx
+- ort
+---
+# [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased) のONNXおよびORTモデルと量子化モデル
+[Click here for the English README](README.md)
+このリポジトリは、元のモデル [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased) をONNXおよびORT形式に変換し、さらに量子化したものです。
+## ライセンス
+このモデルのライセンスは「apache-2.0」です。詳細は元のモデルページ（[google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased)）を参照してください。
+## 使い方
+このモデルを使用するには、ONNX Runtimeをインストールし、以下のように推論を行います。
+```python
+# サンプルコード
+import onnxruntime as ort
+import numpy as np
+from transformers import AutoTokenizer
+import os
+# トークナイザーの読み込み
+tokenizer = AutoTokenizer.from_pretrained('google-bert/bert-base-cased')
+# 入力の準備
+text = 'ここに入力テキストを置き換えてください。'
+inputs = tokenizer(text, return_tensors='np')
+# 使用するモデルのパスを指定
+# ONNXモデルとORTモデルの両方をテストする
+model_paths = [
+    'onnx_models/model_opt.onnx',    # ONNXモデル
+    'ort_models/model.ort'  # ORTフォーマットのモデル
+]
+# モデルごとに推論を実行
+for model_path in model_paths:
+    print(f'\n===== Using model: {model_path} =====')
+    # モデルの拡張子を取得
+    model_extension = os.path.splitext(model_path)[1]
+    # モデルの読み込み
+    if model_extension == '.ort':
+        # ORTフォーマットのモデルをロード
+        session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
+    else:
+        # ONNXモデルをロード
+        session = ort.InferenceSession(model_path)
+    # 推論の実行
+    outputs = session.run(None, dict(inputs))
+    # 出力の形状を表示
+    for idx, output in enumerate(outputs):
+        print(f'Output {idx} shape: {output.shape}')
+    # 結果の表示（必要に応じて処理を追加）
+    print(outputs)
+```
+## モデルの内容
+このリポジトリには、以下のモデルが含まれています。
+### ONNXモデル
+- `onnx_models/model.onnx`: [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased) から変換された元のONNXモデル
+- `onnx_models/model_opt.onnx`: 最適化されたONNXモデル
+- `onnx_models/model_fp16.onnx`: FP16による量子化モデル
+- `onnx_models/model_int8.onnx`: INT8による量子化モデル
+- `onnx_models/model_uint8.onnx`: UINT8による量子化モデル
+### ORTモデル
+- `ort_models/model.ort`: 最適化されたONNXモデルを使用したORTモデル
+- `ort_models/model_fp16.ort`: FP16量子化モデルを使用したORTモデル
+- `ort_models/model_int8.ort`: INT8量子化モデルを使用したORTモデル
+- `ort_models/model_uint8.ort`: UINT8量子化モデルを使用したORTモデル
+## 注意事項
+元のモデル [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased) のライセンスおよび使用条件を遵守してください。
+## 貢献
+問題や改善点があれば、Issueを作成するかプルリクエストを送ってください。

onnx_models/model_fp16.onnx CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8018dc667f9c8fa960550506e3ff4049eb234acb1da01690996bf457b7e09468
+oid sha256:9bccba72704ea37f0c4c9cbd1e6f8ec8842d551c8fe5b1493f1f79016c599818
 size 216873727

onnx_models/model_int8.onnx CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:36ddb0c9d57a3405776108990d1e2af20d78baad76f74ec566e9af2b2d8c83d4
+oid sha256:9d0b56c5e358d975d007e955a03bd5329b95a7a456a4c99dafdbb2bb5f276ca4
 size 109021896

onnx_models/model_opt.onnx CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:76ad5e428e005667a83d18cf0041975173bf07b4aaad4156c361a65b193dc81f
+oid sha256:f323cb656d7b6911cd5df0ab62f08bab2326e88c053c6c40893c851695835648
 size 433432489

onnx_models/model_uint8.onnx CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d4b91c713c9c66b1ecb354ed79f127c0c726bbef96c39519e35d78a835c9c71c
+oid sha256:a907112c037031baf01e83ef2a87fb43808926c91e9429dc5fa16db43d1b9c31
 size 109021936
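
The `size` fields in the four ONNX pointer files above give a quick view of how much each variant shrinks the model on disk. The arithmetic below is only a sketch using those byte counts. File size reflects weight storage precision and says nothing about inference speed or accuracy. The dictionary name `SIZES` is chosen here for illustration.

```python
# Byte counts copied from the Git LFS pointer files in this commit.
SIZES = {
    "model_opt.onnx": 433432489,
    "model_fp16.onnx": 216873727,
    "model_int8.onnx": 109021896,
    "model_uint8.onnx": 109021936,
}

baseline = SIZES["model_opt.onnx"]

# Ratio of each file's size to the optimized FP32 model.
ratios = {name: size / baseline for name, size in SIZES.items()}

for name, size in SIZES.items():
    print(f"{name}: {size / 2**20:.1f} MiB, {ratios[name]:.2f}x of model_opt.onnx")
```

The FP16 file comes out at roughly half the baseline and the INT8/UINT8 files at roughly a quarter, which is what 2-byte and 1-byte weight storage would suggest relative to 4-byte floats.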

ort_models/model.ort CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4003c01ef5ddd95f484db977712698b1ef1e5c830b82b1d19f24080dd98c8d27
-size 433626880
+oid sha256:22b0486744788642f368bd78dc86b0a69b0e2ec3199c2dad073f8e0102b25c6a
+size 433627096

ort_models/model_fp16.ort CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a861273e0cd1fb5d824c6ffa66941b0c98251d4196582afe57a0c47f97f9ed8c
-size 217422160
+oid sha256:44e8c28a3b2ba6f664a72b78a77dd5bf786db1129907da4d771b0b557a854708
+size 217422272

ort_models/model_int8.ort CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e19c5797a6e2ea284c76dce83a666e5678dbf398c67bd387960e93b4cb95ecd0
-size 109186776
+oid sha256:5a1b8dd94f2e9b6d8ae2ec3228e0f2080bb36820e6fef0b1cfc97946afe0c2bd
+size 109186784

ort_models/model_uint8.ort CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c2e09646d645e45d8f3d8c694ea6f89f631cdef595ae0cd6fda14e2d13dfa7c8
-size 109186776
+oid sha256:5544801af732cbc296fa9e85f9e8488700bd8d6250c22b67f43d90036b63e80d
+size 109186784
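
Each model diff in this commit changes only a Git LFS pointer: a small text file holding the spec version, a SHA-256 `oid`, and a byte `size`. The sketch below shows one way to confirm that a downloaded model file matches its pointer. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this illustration, not part of Git LFS or this repository.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Check a local file's size and SHA-256 digest against a parsed pointer."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']!r}")
    # Cheap check first: a size mismatch means the file cannot match.
    if path.stat().st_size != pointer["size"]:
        return False
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large model files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer["digest"]


# New pointer for ort_models/model_uint8.ort, copied from the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:5544801af732cbc296fa9e85f9e8488700bd8d6250c22b67f43d90036b63e80d\n"
    "size 109186784\n"
)
print(pointer["algo"], pointer["size"])
```

With the file downloaded locally, `matches_pointer(Path("ort_models/model_uint8.ort"), pointer)` would report whether it corresponds to the version introduced by this commit.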