binhcode25 committed on
Commit a0f3972
1 Parent(s): d792ef7

Add new SentenceTransformer model.

Files changed (3)
  1. README.md +46 -0
  2. model.onnx +2 -2
  3. tokenizer.json +16 -2
README.md ADDED
@@ -0,0 +1,46 @@
+ ---
+ library_name: light-embed
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - feature-extraction
+ - sentence-similarity
+
+ ---
+
+ # sbert-all-MiniLM-L6-v2-onnx
+
+ This is the ONNX version of the Sentence Transformers model sentence-transformers/all-MiniLM-L6-v2 for sentence embedding, optimized for speed and lightweight performance. By utilizing onnxruntime and tokenizers instead of heavier libraries like sentence-transformers and transformers, this version ensures a smaller library size and faster execution. Below are the details of the model:
+ - Base model: sentence-transformers/all-MiniLM-L6-v2
+ - Embedding dimension: 384
+ - Max sequence length: 256
+ - File size on disk: 0.08 GB
+ - Pooling incorporated: Yes
+
+ This ONNX model consists of all components of the original Sentence Transformers model:
+ Transformer, Pooling, Normalize
+
+ <!--- Describe your model here -->
+
+ ## Usage (LightEmbed)
+
+ Using this model becomes easy when you have [LightEmbed](https://pypi.org/project/light-embed/) installed:
+
+ ```
+ pip install -U light-embed
+ ```
+
+ Then you can use the model like this:
+
+ ```python
+ from light_embed import TextEmbedding
+ sentences = ["This is an example sentence", "Each sentence is converted"]
+
+ model = TextEmbedding('sentence-transformers/all-MiniLM-L6-v2')
+ embeddings = model.encode(sentences)
+ print(embeddings)
+ ```
+
+ ## Citing & Authors
+
+ Binh Nguyen / [email protected]
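Note on the README's "Transformer, Pooling, Normalize" line: the base model all-MiniLM-L6-v2 uses mean pooling followed by L2 normalization, so the ONNX graph presumably does the same internally. The sketch below only illustrates those two steps on toy data in plain Python; the function names `mean_pool` and `l2_normalize` are hypothetical and are not part of light-embed or of the exported graph.

```python
import math
from typing import List


def mean_pool(token_vectors: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention mask is 1 (padding is ignored)."""
    dim = len(token_vectors[0])
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        return [0.0] * dim
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return list(vector) if norm == 0.0 else [x / norm for x in vector]


# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
```

Because the output is unit-length, the cosine similarity between two such embeddings reduces to a plain dot product.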
model.onnx CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:1fef24b391a698bc5a4941d0349925014ee29cc00c21486e09e238b46936b37f
- size 90446038
+ oid sha256:bf79aa51e1c7a52c48441b1d2234d6b58d1a9e53a75cc8fc91033606cbb6802f
+ size 90446096
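Note on the pointer change: model.onnx is tracked with Git LFS, so the diff only shows the new SHA-256 object ID and byte size (58 bytes larger than before). A downloaded file can be checked against the pointer along these lines. This is a minimal sketch; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local path to the model file is left to the reader.

```python
import hashlib
import os

# New pointer contents, copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:bf79aa51e1c7a52c48441b1d2234d6b58d1a9e53a75cc8fc91033606cbb6802f
size 90446096
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and SHA-256 the pointer records."""
    expected_digest = pointer["oid"].split(":", 1)[1]
    if os.path.getsize(path) != int(pointer["size"]):
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Hash in 1 MiB chunks so a large model file is never fully loaded into memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_digest


pointer = parse_lfs_pointer(POINTER)
```

Calling `matches_pointer("model.onnx", pointer)` on a local copy would then confirm that the download corresponds to this commit.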
tokenizer.json CHANGED
@@ -1,7 +1,21 @@
  {
    "version": "1.0",
-   "truncation": null,
-   "padding": null,
+   "truncation": {
+     "direction": "Right",
+     "max_length": 128,
+     "strategy": "LongestFirst",
+     "stride": 0
+   },
+   "padding": {
+     "strategy": {
+       "Fixed": 128
+     },
+     "direction": "Right",
+     "pad_to_multiple_of": null,
+     "pad_id": 0,
+     "pad_type_id": 0,
+     "pad_token": "[PAD]"
+   },
    "added_tokens": [
      {
        "id": 0,
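
Note on the tokenizer change: with these settings every encoded input comes out exactly 128 token IDs long. Longer inputs are cut on the right and shorter ones are filled on the right with `[PAD]` (ID 0). The effect can be sketched in plain Python as follows; `truncate_and_pad` is a hypothetical helper, and it ignores the special tokens that the real tokenizer reserves room for before truncating.

```python
from typing import List, Tuple

MAX_LENGTH = 128  # "max_length" and "Fixed" values from the diff
PAD_ID = 0        # "pad_id" from the diff


def truncate_and_pad(token_ids: List[int]) -> Tuple[List[int], List[int]]:
    """Return (input_ids, attention_mask), both exactly MAX_LENGTH long."""
    kept = token_ids[:MAX_LENGTH]              # truncation direction "Right": drop the tail
    padding = MAX_LENGTH - len(kept)
    input_ids = kept + [PAD_ID] * padding      # padding direction "Right": append pad IDs
    attention_mask = [1] * len(kept) + [0] * padding
    return input_ids, attention_mask


# Arbitrary example IDs, not real tokenizer output.
short_ids, short_mask = truncate_and_pad([11, 22, 33])
long_ids, long_mask = truncate_and_pad(list(range(1, 301)))
```

A fixed length gives the ONNX model a constant input shape, at the cost of computing over padding for short sentences.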