louis030195 committed on
Commit 35d8ff9 · 1 Parent(s): 3e4cee6

Delete README.me

  1. README.me +0 -84
README.me DELETED
@@ -1,84 +0,0 @@
1
- ---
2
- pipeline_tag: sentence-similarity
3
- tags:
4
- - sentence-transformers
5
- - causal-lm
6
- license:
7
- - cc-by-sa-4.0
8
- ---
9
-
10
- # TODO: Name of Model
11
-
12
- TODO: Description
13
-
14
- ## Model Description
15
- TODO: Add relevant content
16
-
17
- (0) Base Transformer Type: RobertaModel
18
-
19
- (1) Pooling mean
20
-
21
-
22
- ## Usage (Sentence-Transformers)
23
-
24
- Using this model becomes more convenient when you have [sentence-transformers](https://github.com/UKPLab/sentence-transformers) installed:
25
-
26
- ```
27
- pip install -U sentence-transformers
28
- ```
29
-
30
- Then you can use the model like this:
31
-
32
- ```python
33
- from sentence_transformers import SentenceTransformer
34
- sentences = ["This is an example sentence"]
35
-
36
- model = SentenceTransformer(TODO)
37
- embeddings = model.encode(sentences)
38
- print(embeddings)
39
- ```
40
-
41
-
42
- ## Usage (HuggingFace Transformers)
43
-
44
- ```python
45
- from transformers import AutoTokenizer, AutoModel
46
- import torch
47
-
48
- # The next step is optional if you want your own pooling function.
49
- # Max Pooling - Take the max value over time for every dimension.
50
- def max_pooling(model_output, attention_mask):
51
-     token_embeddings = model_output[0]  # First element of model_output contains all token embeddings
52
-     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
53
-     token_embeddings[input_mask_expanded == 0] = -1e9  # Set padding tokens to large negative value
54
-     max_over_time = torch.max(token_embeddings, 1)[0]
55
-     return max_over_time
56
-
57
- # Sentences we want sentence embeddings for
58
- sentences = ['This is an example sentence']
59
-
60
- # Load model from HuggingFace Hub
61
- tokenizer = AutoTokenizer.from_pretrained(TODO)
62
- model = AutoModel.from_pretrained(TODO)
63
-
64
- # Tokenize sentences
65
- encoded_input = tokenizer(sentences, padding=True, truncation=True, max_length=128, return_tensors='pt')
66
-
67
- # Compute token embeddings
68
- with torch.no_grad():
69
-     model_output = model(**encoded_input)
70
-
71
- # Perform pooling. In this case, max pooling.
72
- sentence_embeddings = max_pooling(model_output, encoded_input['attention_mask'])
73
-
74
- print("Sentence embeddings:")
75
- print(sentence_embeddings)
76
- ```
77
-
78
-
79
-
80
- ## TODO: Training Procedure
81
-
82
- ## TODO: Evaluation Results
83
-
84
- ## TODO: Citing & Authors
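
The deleted README lists "Pooling mean" in its model description, while its HuggingFace Transformers snippet applies max pooling. For comparison, here is a minimal sketch of mean pooling, assuming the usual approach of averaging token embeddings over non-padding positions. The function name `mean_pooling` and the toy tensors are illustrative assumptions and do not appear in the original file.

```python
import torch


def mean_pooling(token_embeddings, attention_mask):
    """Average token embeddings over positions where attention_mask is 1."""
    # Expand the mask from (batch, seq_len) to (batch, seq_len, hidden_dim)
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    # Sum only the embeddings of real (non-padding) tokens
    summed = torch.sum(token_embeddings * mask, dim=1)
    # Count real tokens per sentence; clamp to avoid division by zero
    counts = torch.clamp(mask.sum(dim=1), min=1e-9)
    return summed / counts


# Toy input: one sentence, three token positions, the last one is padding
token_embeddings = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
attention_mask = torch.tensor([[1, 1, 0]])

sentence_embeddings = mean_pooling(token_embeddings, attention_mask)
print(sentence_embeddings)  # tensor([[2., 3.]])
```

With a real model, `token_embeddings` would be `model_output[0]` and `attention_mask` would be `encoded_input['attention_mask']`, as in the snippet from the deleted file.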