Update README.md
README.md
CHANGED
@@ -8,7 +8,7 @@ tags:

---

-#
+# indo-sbert-base

This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.

@@ -26,9 +26,12 @@ Then you can use the model like this:

```python
from sentence_transformers import SentenceTransformer
-sentences = ["
+sentences = ["Ibukota Perancis adalah Paris",
+             "Menara Eifel terletak di Paris, Perancis",
+             "Pizza adalah makanan khas Italia",
+             "Saya kuliah di Carneige Melon University"]

-model = SentenceTransformer('
+model = SentenceTransformer('firqaaa/indo-sbert-finetuned-anli-id')
embeddings = model.encode(sentences)
print(embeddings)
```
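The card's claim that the embeddings support clustering and semantic search can be checked directly on the usage example above. A minimal sketch, assuming a sentence-transformers release that exposes `util.cos_sim` (model id and sentences are the ones from the diff):

```python
from sentence_transformers import SentenceTransformer, util

# Encode three of the example sentences and compare them.
model = SentenceTransformer('firqaaa/indo-sbert-finetuned-anli-id')
embeddings = model.encode(["Ibukota Perancis adalah Paris",
                           "Menara Eifel terletak di Paris, Perancis",
                           "Pizza adalah makanan khas Italia"])

# Each embedding is 768-dimensional, as stated in the card.
print(embeddings.shape)                      # (3, 768)

# Cosine similarity matrix; the two Paris sentences should score
# higher against each other than against the pizza sentence.
print(util.cos_sim(embeddings, embeddings))
```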
@@ -51,11 +54,15 @@ def mean_pooling(model_output, attention_mask):


# Sentences we want sentence embeddings for
-sentences = [
+sentences = ["Ibukota Perancis adalah Paris",
+             "Menara Eifel terletak di Paris, Perancis",
+             "Pizza adalah makanan khas Italia",
+             "Saya kuliah di Carneige Melon University"]
+

# Load model from HuggingFace Hub
-tokenizer = AutoTokenizer.from_pretrained('
-model = AutoModel.from_pretrained('
+tokenizer = AutoTokenizer.from_pretrained('firqaaa/indo-sbert-finetuned-anli-id')
+model = AutoModel.from_pretrained('firqaaa/indo-sbert-finetuned-anli-id')

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
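The hunk header above points into the card's plain-transformers example, which relies on the `mean_pooling` helper defined just before this hunk. In these auto-generated sentence-transformers cards that helper is the standard attention-mask-weighted mean over token embeddings; a sketch of it and of the remaining pooling step, for context (not shown in the diff itself):

```python
import torch

def mean_pooling(model_output, attention_mask):
    # First element of model_output holds all token embeddings.
    token_embeddings = model_output[0]
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    # Average only over real tokens, ignoring padding positions.
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

# Compute token embeddings, then pool them into sentence embeddings.
with torch.no_grad():
    model_output = model(**encoded_input)
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
```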
@@ -85,7 +92,7 @@ The model was trained with the parameters:

**DataLoader**:

-`sentence_transformers.datasets.NoDuplicatesDataLoader.NoDuplicatesDataLoader` of length
+`sentence_transformers.datasets.NoDuplicatesDataLoader.NoDuplicatesDataLoader` of length 19644 with parameters:
```
{'batch_size': 16}
```
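For readers unfamiliar with that loader: `NoDuplicatesDataLoader` batches `InputExample`s while keeping duplicate texts out of the same batch, which matters for in-batch-negative losses. A hedged sketch of how such a loader is typically constructed; the `train_examples` pair below is made up for illustration, since the actual ANLI-id training data is not shown in this card:

```python
from sentence_transformers import InputExample
from sentence_transformers.datasets import NoDuplicatesDataLoader

# Hypothetical training pair; the real training set uses ANLI-id examples.
train_examples = [
    InputExample(texts=["Ibukota Perancis adalah Paris",
                        "Menara Eifel terletak di Paris, Perancis"]),
]

# batch_size=16 matches the parameters reported in the card.
train_dataloader = NoDuplicatesDataLoader(train_examples, batch_size=16)
```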
@@ -126,4 +133,4 @@ SentenceTransformer(

## Citing & Authors

-<!--- Describe where people can find more information -->
+<!--- Describe where people can find more information -->