Commit 1d5d8fe by whitphx (HF Staff) · verified · 1 Parent(s): d43c870

Add/update the quantized ONNX model files and README.md for Transformers.js v3

## Applied Quantizations

### ❌ Based on `model.onnx` *with* slimming

```
0%| | 0/1 [00:00<?, ?it/s]
Processing /tmp/tmpfxbvrttu/model.onnx: 0%| | 0/1 [00:00<?, ?it/s]

0%| | 0/5 [00:00<?, ?it/s]

- Quantizing to int8: 0%| | 0/5 [00:00<?, ?it/s]
2025-07-22 08:08:18,400 root [INFO] - Quantization parameters for tensor:"/emb_ln/Add_1_output_0" not specified
2025-07-22 08:08:18,407 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul]
2025-07-22 08:08:18,407 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul_1]
2025-07-22 08:08:18,408 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,411 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,427 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,435 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,441 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.1/attn/MatMul]
2025-07-22 08:08:18,441 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.1/attn/MatMul_1]
2025-07-22 08:08:18,442 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,445 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,462 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,470 root [INFO] - Quantization parameters for tensor:"/encoder/layers.1/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,476 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.2/attn/MatMul]
2025-07-22 08:08:18,476 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.2/attn/MatMul_1]
2025-07-22 08:08:18,477 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,480 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,496 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,504 root [INFO] - Quantization parameters for tensor:"/encoder/layers.2/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,510 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.3/attn/MatMul]
2025-07-22 08:08:18,510 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.3/attn/MatMul_1]
2025-07-22 08:08:18,511 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,514 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,529 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,538 root [INFO] - Quantization parameters for tensor:"/encoder/layers.3/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,545 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.4/attn/MatMul]
2025-07-22 08:08:18,545 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.4/attn/MatMul_1]
2025-07-22 08:08:18,546 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,549 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,566 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,575 root [INFO] - Quantization parameters for tensor:"/encoder/layers.4/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,582 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.5/attn/MatMul]
2025-07-22 08:08:18,582 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.5/attn/MatMul_1]
2025-07-22 08:08:18,583 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,586 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,603 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,611 root [INFO] - Quantization parameters for tensor:"/encoder/layers.5/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,618 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.6/attn/MatMul]
2025-07-22 08:08:18,619 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.6/attn/MatMul_1]
2025-07-22 08:08:18,620 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,622 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,640 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,649 root [INFO] - Quantization parameters for tensor:"/encoder/layers.6/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,656 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.7/attn/MatMul]
2025-07-22 08:08:18,656 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.7/attn/MatMul_1]
2025-07-22 08:08:18,657 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,660 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,678 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,686 root [INFO] - Quantization parameters for tensor:"/encoder/layers.7/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,693 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.8/attn/MatMul]
2025-07-22 08:08:18,693 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.8/attn/MatMul_1]
2025-07-22 08:08:18,695 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,698 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,715 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,724 root [INFO] - Quantization parameters for tensor:"/encoder/layers.8/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,731 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.9/attn/MatMul]
2025-07-22 08:08:18,731 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.9/attn/MatMul_1]
2025-07-22 08:08:18,732 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,735 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,753 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,762 root [INFO] - Quantization parameters for tensor:"/encoder/layers.9/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,770 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.10/attn/MatMul]
2025-07-22 08:08:18,770 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.10/attn/MatMul_1]
2025-07-22 08:08:18,771 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,774 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,792 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:18,801 root [INFO] - Quantization parameters for tensor:"/encoder/layers.10/norm2/Add_1_output_0" not specified
2025-07-22 08:08:18,808 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.11/attn/MatMul]
2025-07-22 08:08:18,808 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.11/attn/MatMul_1]
2025-07-22 08:08:18,809 root [INFO] - Quantization parameters for tensor:"/encoder/layers.11/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:18,813 root [INFO] - Quantization parameters for tensor:"/encoder/layers.11/norm1/Add_1_output_0" not specified
2025-07-22 08:08:18,831 root [INFO] - Quantization parameters for tensor:"/encoder/layers.11/mlp/Mul_1_output_0" not specified


- Quantizing to int8: 20%|██ | 1/5 [00:05<00:20, 5.18s/it]

- Quantizing to uint8: 20%|██ | 1/5 [00:05<00:20, 5.18s/it]
2025-07-22 08:08:23,006 root [INFO] - Quantization parameters for tensor:"/emb_ln/Add_1_output_0" not specified
2025-07-22 08:08:23,012 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul]
2025-07-22 08:08:23,013 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.0/attn/MatMul_1]
2025-07-22 08:08:23,014 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/attn/Reshape_1_output_0" not specified
2025-07-22 08:08:23,016 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm1/Add_1_output_0" not specified
2025-07-22 08:08:23,033 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/mlp/Mul_1_output_0" not specified
2025-07-22 08:08:23,040 root [INFO] - Quantization parameters for tensor:"/encoder/layers.0/norm2/Add_1_output_0" not specified
2025-07-22 08:08:23,046 root [INFO] - Ignore MatMul due to non constant B: /[/encoder/layers.1/att
```

Files changed (1)
  1. README.md +16 -0
README.md CHANGED
@@ -5,4 +5,20 @@ library_name: transformers.js
 
  https://huggingface.co/nomic-ai/nomic-embed-text-v1-unsupervised with ONNX weights to be compatible with Transformers.js.
 
+ ## Usage (Transformers.js)
+
+ If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+ ```bash
+ npm i @huggingface/transformers
+ ```
+
+ **Example:** Run feature extraction.
+
+ ```js
+ import { pipeline } from '@huggingface/transformers';
+
+ const extractor = await pipeline('feature-extraction', 'Xenova/nomic-embed-text-v1-unsupervised');
+ const output = await extractor('This is a simple test.');
+ ```
+
  Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
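
The feature-extraction example added to the README stops at the raw `output`. Assuming that output holds one vector per token, a sentence embedding is commonly obtained by mean pooling followed by L2 normalization. The sketch below illustrates that post-processing on plain nested arrays. The helper names (`meanPool`, `l2Normalize`, `dot`) are hypothetical and not part of Transformers.js, and the two-dimensional toy vectors stand in for real model output.

```js
// Hypothetical helpers (not part of Transformers.js): plain-JS pooling utilities.

// Average per-token vectors into one sentence vector.
// `tokens` is an array of equal-length numeric arrays, shape [numTokens][dim].
function meanPool(tokens) {
  const dim = tokens[0].length;
  const sum = new Array(dim).fill(0);
  for (const token of tokens) {
    for (let i = 0; i < dim; i++) sum[i] += token[i];
  }
  return sum.map((v) => v / tokens.length);
}

// Scale a vector to unit L2 norm (an all-zero vector is returned as a copy).
function l2Normalize(vec) {
  const norm = Math.sqrt(vec.reduce((acc, v) => acc + v * v, 0));
  return norm === 0 ? vec.slice() : vec.map((v) => v / norm);
}

// For unit-length vectors, cosine similarity reduces to a dot product.
function dot(a, b) {
  return a.reduce((acc, v, i) => acc + v * b[i], 0);
}

// Toy data standing in for per-token model output (assumption: real
// embeddings have far more dimensions and tokens).
const a = l2Normalize(meanPool([[1, 0], [3, 0]]));
const b = l2Normalize(meanPool([[0, 2], [0, 4]]));
console.log(dot(a, a), dot(a, b)); // 1 0
```

In practice the manual step is likely unnecessary: to my knowledge the Transformers.js feature-extraction pipeline accepts `{ pooling: 'mean', normalize: true }` as a second argument to `extractor(...)`, which should produce an equivalent pooled, normalized embedding directly.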