whitphx HF Staff committed on
Commit fbf9fb6 · verified · 1 Parent(s): 0b85f78

Add/update the quantized ONNX model files and README.md for Transformers.js v3

## Applied Quantizations

### ❌ Based on `model.onnx` *with* slimming

```
0%| | 0/1 [00:00<?, ?it/s]
Processing /tmp/tmpbl__miwg/model.onnx: 0%| | 0/1 [00:00<?, ?it/s]

0%| | 0/5 [00:00<?, ?it/s]

- Quantizing to int8: 0%| | 0/5 [00:00<?, ?it/s]
2025-07-22 08:06:58,873 root [INFO] - Quantization parameters for tensor:"/emb_ln/Add_1_output_0" not specified
2025-07-22 08:06:58,879 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul]
2025-07-22 08:06:58,879 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul_1]
2025-07-22 08:06:58,880 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:58,883 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm1/Add_1_output_0" not specified
2025-07-22 08:06:58,900 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:58,908 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm2/Add_1_output_0" not specified
2025-07-22 08:06:58,914 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.1/attn/MatMul]
2025-07-22 08:06:58,914 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.1/attn/MatMul_1]
2025-07-22 08:06:58,915 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:58,918 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/norm1/Add_1_output_0" not specified
2025-07-22 08:06:58,934 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:58,942 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/norm2/Add_1_output_0" not specified
2025-07-22 08:06:58,948 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.2/attn/MatMul]
2025-07-22 08:06:58,948 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.2/attn/MatMul_1]
2025-07-22 08:06:58,949 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:58,952 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/norm1/Add_1_output_0" not specified
2025-07-22 08:06:58,969 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:58,977 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/norm2/Add_1_output_0" not specified
2025-07-22 08:06:58,983 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.3/attn/MatMul]
2025-07-22 08:06:58,983 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.3/attn/MatMul_1]
2025-07-22 08:06:58,984 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:58,987 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,002 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,011 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,018 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.4/attn/MatMul]
2025-07-22 08:06:59,018 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.4/attn/MatMul_1]
2025-07-22 08:06:59,019 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,022 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,038 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,047 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,054 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.5/attn/MatMul]
2025-07-22 08:06:59,054 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.5/attn/MatMul_1]
2025-07-22 08:06:59,055 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,058 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,076 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,084 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,091 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.6/attn/MatMul]
2025-07-22 08:06:59,091 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.6/attn/MatMul_1]
2025-07-22 08:06:59,092 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,095 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,112 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,121 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,128 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.7/attn/MatMul]
2025-07-22 08:06:59,128 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.7/attn/MatMul_1]
2025-07-22 08:06:59,129 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,132 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,150 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,158 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,165 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.8/attn/MatMul]
2025-07-22 08:06:59,166 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.8/attn/MatMul_1]
2025-07-22 08:06:59,167 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,170 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,188 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,196 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,203 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.9/attn/MatMul]
2025-07-22 08:06:59,203 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.9/attn/MatMul_1]
2025-07-22 08:06:59,204 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,207 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,225 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,235 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,242 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.10/attn/MatMul]
2025-07-22 08:06:59,242 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.10/attn/MatMul_1]
2025-07-22 08:06:59,243 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,246 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,264 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/mlp/Mul_1_output_0" not specified
2025-07-22 08:06:59,273 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/norm2/Add_1_output_0" not specified
2025-07-22 08:06:59,280 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.11/attn/MatMul]
2025-07-22 08:06:59,280 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.11/attn/MatMul_1]
2025-07-22 08:06:59,282 root [INFO] - Quantization parameters for tensor:"/encoder/layers.11/attn/Reshape_1_output_0" not specified
2025-07-22 08:06:59,285 root [INFO] - Quantization parameters for tensor:"/encoder/layers.11/norm1/Add_1_output_0" not specified
2025-07-22 08:06:59,303 root [INFO] - Quantization parameters for tensor:"/encoder/layers.11/mlp/Mul_1_output_0" not specified


- Quantizing to int8: 20%|██ | 1/5 [00:05<00:20, 5.19s/it]

- Quantizing to uint8: 20%|██ | 1/5 [00:05<00:20, 5.19s/it]
2025-07-22 08:07:03,551 root [INFO] - Quantization parameters for tensor:"/emb_ln/Add_1_output_0" not specified
2025-07-22 08:07:03,557 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul]
2025-07-22 08:07:03,557 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul_1]
2025-07-22 08:07:03,558 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/attn/Reshape_1_output_0" not specified
2025-07-22 08:07:03,561 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm1/Add_1_output_0" not specified
2025-07-22 08:07:03,577 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/mlp/Mul_1_output_0" not specified
2025-07-22 08:07:03,585 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm2/Add_1_output_0" not specified
2025-07-22 08:07:03,591 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.1/att
```

Files changed (1)
  1. README.md +16 -0
README.md CHANGED
@@ -5,4 +5,20 @@ library_name: transformers.js
 
  https://huggingface.co/nomic-ai/nomic-embed-text-v1 with ONNX weights to be compatible with Transformers.js.
 
+ ## Usage (Transformers.js)
+
+ If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+ ```bash
+ npm i @huggingface/transformers
+ ```
+
+ **Example:** Run feature extraction.
+
+ ```js
+ import { pipeline } from '@huggingface/transformers';
+
+ const extractor = await pipeline('feature-extraction', 'Xenova/nomic-embed-text-v1');
+ const output = await extractor('This is a simple test.');
+ ```
+
  Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
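
Reviewer note (not part of the commit): the README example added above stops at the raw feature-extraction output. For an embedding model, a common next step is to pool the per-token features into one vector, normalize it, and compare vectors by cosine similarity. The sketch below shows that arithmetic on plain arrays. The helper names `meanPool`, `l2Normalize`, and `cosineSimilarity` are hypothetical and not part of Transformers.js, and the toy numbers are made up rather than real model output.

```javascript
// Hypothetical helpers for illustration; not part of Transformers.js.

/** Average token vectors (one row per token) into a single sentence vector. */
function meanPool(tokenVectors) {
  const dim = tokenVectors[0].length;
  const pooled = new Array(dim).fill(0);
  for (const row of tokenVectors) {
    for (let i = 0; i < dim; i++) {
      pooled[i] += row[i] / tokenVectors.length;
    }
  }
  return pooled;
}

/** Scale a vector to unit L2 length (a zero vector is returned unchanged). */
function l2Normalize(vec) {
  const norm = Math.hypot(...vec);
  return norm === 0 ? vec.slice() : vec.map((x) => x / norm);
}

/** Cosine similarity of two equal-length vectors. */
function cosineSimilarity(a, b) {
  const ua = l2Normalize(a);
  const ub = l2Normalize(b);
  return ua.reduce((sum, x, i) => sum + x * ub[i], 0);
}

// Toy per-token embeddings standing in for real model output (made-up numbers).
const sentenceA = l2Normalize(meanPool([[1, 0], [0, 1]]));
const sentenceB = l2Normalize(meanPool([[2, 2]]));

// Both toy sentences point in the same direction, so the score is close to 1.
console.log(cosineSimilarity(sentenceA, sentenceB));
```

In practice the pipeline can do the pooling and normalization itself when called with the `pooling: 'mean'` and `normalize: true` options, so helpers like these are mainly useful for seeing what those options compute or for post-processing vectors that are already stored.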