Add/update the quantized ONNX model files and README.md for Transformers.js v3
Browse files## Applied Quantizations
### ❌ Based on `model.onnx` *with* slimming
```
0%| | 0/1 [00:00<?, ?it/s]
Processing /tmp/tmpbl__miwg/model.onnx: 0%| | 0/1 [00:00<?, ?it/s]
0%| | 0/5 [00:00<?, ?it/s][A
- Quantizing to int8: 0%| | 0/5 [00:00<?, ?it/s][A2025-07-22 08:06:58,873 root [INFO] - Quantization parameters for tensor:"/emb_ln/Add_1_output_0" not specified
2025-07-22 08:06:58,879 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul]
2025-07-22 08:06:58,879 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul_1]
2025-07-22 08:06:58,880 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:58,883 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm1/Add_1_output_0" not specified
2025-07-22 08:06:58,900 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:58,908 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm2/Add_1_output_0" not specified
2025-07-22 08:06:58,914 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.1/attn/MatMul]
2025-07-22 08:06:58,914 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.1/attn/MatMul_1]
2025-07-22 08:06:58,915 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:58,918 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/norm1/Add_1_output_0" not specified
2025-07-22 08:06:58,934 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:58,942 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/norm2/Add_1_output_0" not specified
2025-07-22 08:06:58,948 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.2/attn/MatMul]
2025-07-22 08:06:58,948 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.2/attn/MatMul_1]
2025-07-22 08:06:58,949 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:58,952 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/norm1/Add_1_output_0" not specified
2025-07-22 08:06:58,969 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:58,977 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/norm2/Add_1_output_0" not specified
2025-07-22 08:06:58,983 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.3/attn/MatMul]
2025-07-22 08:06:58,983 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.3/attn/MatMul_1]
2025-07-22 08:06:58,984 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:58,987 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,002 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,011 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,018 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.4/attn/MatMul]
2025-07-22 08:06:59,018 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.4/attn/MatMul_1]
2025-07-22 08:06:59,019 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,022 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,038 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,047 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,054 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.5/attn/MatMul]
2025-07-22 08:06:59,054 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.5/attn/MatMul_1]
2025-07-22 08:06:59,055 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,058 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,076 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,084 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,091 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.6/attn/MatMul]
2025-07-22 08:06:59,091 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.6/attn/MatMul_1]
2025-07-22 08:06:59,092 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,095 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,112 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,121 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,128 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.7/attn/MatMul]
2025-07-22 08:06:59,128 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.7/attn/MatMul_1]
2025-07-22 08:06:59,129 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,132 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,150 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,158 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,165 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.8/attn/MatMul]
2025-07-22 08:06:59,166 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.8/attn/MatMul_1]
2025-07-22 08:06:59,167 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,170 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,188 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,196 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,203 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.9/attn/MatMul]
2025-07-22 08:06:59,203 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.9/attn/MatMul_1]
2025-07-22 08:06:59,204 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,207 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,225 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,235 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,242 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.10/attn/MatMul]
2025-07-22 08:06:59,242 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.10/attn/MatMul_1]
2025-07-22 08:06:59,243 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,246 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,264 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,273 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,280 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.11/attn/MatMul]
2025-07-22 08:06:59,280 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.11/attn/MatMul_1]
2025-07-22 08:06:59,282 root [INFO] - Quantization parameters for tensor:"/encoder/layers.11/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,285 root [INFO] - Quantization parameters for tensor:"/encoder/layers.11/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,303 root [INFO] - Quantization parameters for tensor:"/encoder/layers.11/mlp/Mul_1_output_0" not specified
- Quantizing to int8: 20%|██ | 1/5 [00:05<00:20, 5.19s/it][A
- Quantizing to uint8: 20%|██ | 1/5 [00:05<00:20, 5.19s/it][A2025-07-22 08:07:03,551 root [INFO] - Quantization parameters for tensor:"/emb_ln/Add_1_output_0" not specified
2025-07-22 08:07:03,557 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul]
2025-07-22 08:07:03,557 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul_1]
2025-07-22 08:07:03,558 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/attn/Reshape_1_output_0" not specified
2025-07-22 08:07:03,561 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm1/Add_1_output_0" not specified
2025-07-22 08:07:03,577 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/mlp/Mul_1_output_0" not specified
2025-07-22 08:07:03,585 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm2/Add_1_output_0" not specified
2025-07-22 08:07:03,591 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.1/att
@@ -5,4 +5,20 @@ library_name: transformers.js
|
|
5 |
|
6 |
https://huggingface.co/nomic-ai/nomic-embed-text-v1 with ONNX weights to be compatible with Transformers.js.
|
7 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
8 |
Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
|
|
|
5 |
|
6 |
https://huggingface.co/nomic-ai/nomic-embed-text-v1 with ONNX weights to be compatible with Transformers.js.
|
7 |
|
8 |
+
## Usage (Transformers.js)
|
9 |
+
|
10 |
+
If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
|
11 |
+
```bash
|
12 |
+
npm i @huggingface/transformers
|
13 |
+
```
|
14 |
+
|
15 |
+
**Example:** Run feature extraction.
|
16 |
+
|
17 |
+
```js
|
18 |
+
import { pipeline } from '@huggingface/transformers';
|
19 |
+
|
20 |
+
const extractor = await pipeline('feature-extraction', 'Xenova/nomic-embed-text-v1');
|
21 |
+
const output = await extractor('This is a simple test.');
|
22 |
+
```
|
23 |
+
|
24 |
Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
|