fkrasnov2 committed on
Commit 6990fe5 · verified · 1 Parent(s): 9eb685c

Update README.md

Files changed (1): README.md (+5 −1)
README.md CHANGED
@@ -6,8 +6,11 @@ Encoder-model for search query similarity task.
 
 Fast and accurate.
 
-Sentence Piece fitted on 269 million Russian search queries log.
+SentencePiece tokenizer fitted on a log of 269 million Russian search queries.
 
+DeBERTaV2 with a short context length to save memory.
+
+![Sample preference dataset](https://huggingface.co/fkrasnov2/SBE/bvf_recall1k_query_len_eng.svg)
 
 ```python
 from transformers import AutoModel, AutoTokenizer
@@ -17,6 +20,7 @@ tokenizer = AutoTokenizer.from_pretrained('fkrasnov2/SBE')
 
 input_ids = tokenizer.encode("чёрное платье", max_length=model.config.max_position_embeddings, truncation=True, return_tensors='pt')
 
+model.eval()
 vector = model(input_ids=input_ids, attention_mask=input_ids>3)[0][0,0]
 
 assert model.config.hidden_size == vector.shape[0]
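The README snippet in this diff produces a single query embedding; for the search-query-similarity task the model card describes, two such embeddings are typically compared with cosine similarity. A minimal sketch of that comparison step — the `cosine_similarity` helper and the example queries are illustrative, not part of the model card:

```python
import torch

def cosine_similarity(u: torch.Tensor, v: torch.Tensor) -> float:
    """Cosine similarity between two 1-D embedding vectors."""
    # Dot product of the vectors divided by the product of their norms.
    return float(torch.dot(u, v) / (u.norm() * v.norm()))

# Hypothetical usage with the model card's snippet, assuming `model` and
# `tokenizer` are loaded from 'fkrasnov2/SBE' as shown above and each query
# is encoded to a `vector` the same way:
#   v1 = embed("чёрное платье")   # `embed` is an assumed wrapper around
#   v2 = embed("платье чёрное")   # the encode + forward steps in the README
#   score = cosine_similarity(v1, v2)  # closer to 1.0 = more similar
```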