whitphx (HF Staff) committed
Commit af65a54 · verified · 1 Parent(s): d82f415

Add/update the quantized ONNX model files and README.md for Transformers.js v3

## Applied Quantizations

### ✅ Based on `decoder_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_bnb4.onnx` (added)

### ✅ Based on `decoder_with_past_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_with_past_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_with_past_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_with_past_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_with_past_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_with_past_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_with_past_model_bnb4.onnx` (added)

### ✅ Based on `decoder_model_merged.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_model_merged_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_model_merged_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_merged_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_merged_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_merged_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_merged_bnb4.onnx` (added)

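For reference, a minimal sketch of how one of the quantized variants listed above could be selected when loading the model with Transformers.js v3 via the `dtype` option. It assumes the `dtype` values map to the file-name suffixes above (e.g. `'q4'`, `'q4f16'`, `'bnb4'`) and that the repo is consumed as `Xenova/pythia-70m`, as in the README example below; treat it as illustrative rather than part of the commit.

```js
import { pipeline } from '@huggingface/transformers';

// Pick one of the quantizations added in this commit: 'fp16', 'int8', 'uint8', 'q4', 'q4f16', 'bnb4'.
// (Assumption: the dtype value maps to the corresponding *_<dtype>.onnx file in the onnx/ folder.)
const generator = await pipeline('text-generation', 'Xenova/pythia-70m', { dtype: 'q4' });

const output = await generator('Once upon a time, there was', { max_new_tokens: 10 });
console.log(output);
```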

README.md CHANGED
@@ -5,4 +5,20 @@ library_name: transformers.js
 
 https://huggingface.co/EleutherAI/pythia-70m with ONNX weights to be compatible with Transformers.js.
 
+ ## Usage (Transformers.js)
+
+ If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+ ```bash
+ npm i @huggingface/transformers
+ ```
+
+ **Example:** Text generation.
+
+ ```js
+ import { pipeline } from '@huggingface/transformers';
+
+ const generator = await pipeline('text-generation', 'Xenova/pythia-70m');
+ const output = await generator('Once upon a time, there was', { max_new_tokens: 10 });
+ ```
+
 Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
onnx/decoder_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fd7d550727c5004ebbdbf63f9bd7fe468e7174b80483f46c366cf91eae83de66
+ size 132940207
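
Each of the ONNX files added in this commit is stored as a Git LFS pointer like the one above, recording only the `oid` (SHA-256 digest) and `size` of the actual weight file. As an aside, a small Node.js sketch for checking a locally downloaded copy against those recorded values (the local path is hypothetical):

```js
import { createHash } from 'node:crypto';
import { readFile } from 'node:fs/promises';

// Hypothetical local copy of the file whose LFS pointer appears above.
const buf = await readFile('./onnx/decoder_model_bnb4.onnx');

// The digest should equal the pointer's oid, and the byte length should equal its size.
const digest = createHash('sha256').update(buf).digest('hex');
console.log(digest === 'fd7d550727c5004ebbdbf63f9bd7fe468e7174b80483f46c366cf91eae83de66');
console.log(buf.length === 132940207);
```
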
onnx/decoder_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:53e81a5be8140a3c874b5818c3d8144cc65e8399ce6daca865cf171d7cc89b84
+ size 145374370
onnx/decoder_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b2a6b78ac5468f1f221c5c284c2e9edd21c549139c360e9ccfb9aa96677b6fc6
+ size 75227553
onnx/decoder_model_merged_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:43009bba80ea0ea7bbdc3aceddaceb2483b3f8009d9ae5420e857dde1a45b288
+ size 133277163
onnx/decoder_model_merged_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e1c50bf3dcf4df911531422fb3fde5baf14f61db6c56f1e9d9445010f0c1a7eb
+ size 145711044
onnx/decoder_model_merged_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:63406c5921b7c2a7d6d1d683df0ef0220624ea8525353ff18b0d2bd6aefdf5c8
+ size 75592699
onnx/decoder_model_merged_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:72f41ecb17a42ab5e938fa8abaed2aef43a07703e6c896c11843b2da00920672
+ size 136066122
onnx/decoder_model_merged_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7be3a2192b6a901372ecdd7cadba4da471d3a9d17dd3793ca2b390c415fd5c96
+ size 81562464
onnx/decoder_model_merged_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f30894f99403ac574b66623da27138e14442246a16cb9d64f9880950f24decfa
+ size 75592709
onnx/decoder_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:34a668ebe7cb1a09ac41837f3ac7a5d8eea8fc934eeb67d3d882a423b274b84f
+ size 135729391
onnx/decoder_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:18acb2f99f70a359dc0661920381487f94297da618c12585f2cd6b58505bd07e
+ size 81222463
onnx/decoder_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ec04d47ac70ff64db1ff2366ec57e9379795a2b834e830ab2a800e54a4fdc467
+ size 75227563
onnx/decoder_with_past_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:20b88bedf2bb27eb48c0ddb4862b7b68c2e06fd7db4312bf1aa880d9f01b2d6c
+ size 132968287
onnx/decoder_with_past_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8915b49fdd009724a37fb1e031d81db4a3407af3119b9b81485790616c0357d3
+ size 145404621
onnx/decoder_with_past_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2476d2c4913d0ba1f548e0928404c4ef5b6c2c80bf135e4a014d1035d5ef4285
+ size 75255633
onnx/decoder_with_past_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a011f0e3067324ee5f14222260932fe9025097b7538a4969d50b8d3cc53d13dd
+ size 135757471
onnx/decoder_with_past_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b0fad7acaf796c5a24ddda42051000174e433481e0842571dceabc6edb4d2a26
+ size 81252714
onnx/decoder_with_past_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0e7a6f19df59361bef12c78903c2c5dd9ff7a52bb10686f101eefd0e99f714fe
+ size 75255643