Add/update the quantized ONNX model files and README.md for Transformers.js v3
## Applied Quantizations
### ❌ Based on `model.onnx` *with* slimming
```
  0%|          | 0/1 [00:00<?, ?it/s]
Processing /tmp/tmpfxbvrttu/model.onnx:   0%|          | 0/1 [00:00<?, ?it/s]
 - Quantizing to int8:   0%|          | 0/5 [00:00<?, ?it/s]
2025-07-22 08:08:18,400 root [INFO] - Quantization parameters for tensor:"/emb_ln/Add_1_output_0" not specified
2025-07-22 08:08:18,407 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul]
2025-07-22 08:08:18,407 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul_1]
2025-07-22 08:08:18,408 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,411 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,427 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,435 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,441 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.1/attn/MatMul]
2025-07-22 08:08:18,441 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.1/attn/MatMul_1]
2025-07-22 08:08:18,442 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,445 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,462 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,470 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,476 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.2/attn/MatMul]
2025-07-22 08:08:18,476 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.2/attn/MatMul_1]
2025-07-22 08:08:18,477 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,480 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,496 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,504 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,510 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.3/attn/MatMul]
2025-07-22 08:08:18,510 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.3/attn/MatMul_1]
2025-07-22 08:08:18,511 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,514 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,529 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,538 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,545 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.4/attn/MatMul]
2025-07-22 08:08:18,545 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.4/attn/MatMul_1]
2025-07-22 08:08:18,546 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,549 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,566 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,575 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,582 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.5/attn/MatMul]
2025-07-22 08:08:18,582 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.5/attn/MatMul_1]
2025-07-22 08:08:18,583 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,586 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,603 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,611 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,618 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.6/attn/MatMul]
2025-07-22 08:08:18,619 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.6/attn/MatMul_1]
2025-07-22 08:08:18,620 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,622 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,640 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,649 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,656 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.7/attn/MatMul]
2025-07-22 08:08:18,656 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.7/attn/MatMul_1]
2025-07-22 08:08:18,657 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,660 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,678 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,686 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,693 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.8/attn/MatMul]
2025-07-22 08:08:18,693 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.8/attn/MatMul_1]
2025-07-22 08:08:18,695 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,698 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,715 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,724 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,731 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.9/attn/MatMul]
2025-07-22 08:08:18,731 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.9/attn/MatMul_1]
2025-07-22 08:08:18,732 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,735 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,753 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,762 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,770 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.10/attn/MatMul]
2025-07-22 08:08:18,770 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.10/attn/MatMul_1]
2025-07-22 08:08:18,771 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,774 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,792 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,801 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,808 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.11/attn/MatMul]
2025-07-22 08:08:18,808 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.11/attn/MatMul_1]
2025-07-22 08:08:18,809 root [INFO] - Quantization parameters for tensor:"/encoder/layers.11/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,813 root [INFO] - Quantization parameters for tensor:"/encoder/layers.11/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,831 root [INFO] - Quantization parameters for tensor:"/encoder/layers.11/mlp/Mul_1_output_0" not specified
 - Quantizing to int8:  20%|██        | 1/5 [00:05<00:20,  5.18s/it]
 - Quantizing to uint8:  20%|██        | 1/5 [00:05<00:20,  5.18s/it]
2025-07-22 08:08:23,006 root [INFO] - Quantization parameters for tensor:"/emb_ln/Add_1_output_0" not specified
2025-07-22 08:08:23,012 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul]
2025-07-22 08:08:23,013 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul_1]
2025-07-22 08:08:23,014 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:23,016 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm1/Add_1_output_0" not specified
2025-07-22 08:08:23,033 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:23,040 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm2/Add_1_output_0" not specified
2025-07-22 08:08:23,046 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.1/att
```
### Updated `README.md`

````diff
@@ -5,4 +5,20 @@ library_name: transformers.js
 
 https://huggingface.co/nomic-ai/nomic-embed-text-v1-unsupervised with ONNX weights to be compatible with Transformers.js.
 
+## Usage (Transformers.js)
+
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+```bash
+npm i @huggingface/transformers
+```
+
+**Example:** Run feature extraction.
+
+```js
+import { pipeline } from '@huggingface/transformers';
+
+const extractor = await pipeline('feature-extraction', 'Xenova/nomic-embed-text-v1-unsupervised');
+const output = await extractor('This is a simple test.');
+```
+
 Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
````