hugohrban committed
Commit 4e2e6d0 · verified · Parent: 6c45e77

Update README.md

Files changed (1): README.md (+30 -1)
README.md CHANGED
@@ -9,4 +9,33 @@ ProGen2-small finetuned on 7 protein families.
 
 Bidirectional model trained on both N -> C and C -> N directions of protein sequences, specified by tokens "1" and "2" respectively.
 
-See [github repo](https://github.com/hugohrban/ProGen2-finetuning/tree/main) for more info.
+See my [github repo](https://github.com/hugohrban/ProGen2-finetuning/tree/main) for more information.
+
+Example usage:
+
+```python
+from transformers import AutoModelForCausalLM
+from transformers import AutoTokenizer
+# optionally use local imports
+# from models.progen.modeling_progen import ProGenForCausalLM
+# from models.progen.configuration_progen import ProGenConfig
+import torch
+import torch.nn.functional as F
+
+# load model and tokenizer
+model = AutoModelForCausalLM.from_pretrained("hugohrban/progen2-small-mix7-bidi", trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained("hugohrban/progen2-small-mix7-bidi", trust_remote_code=True)
+
+# prepare input
+prompt = "<|pf00125|>2FDDDVSAVKSTGV"
+input_ids = torch.tensor(tokenizer.encode(prompt)).to(model.device)
+
+# forward pass
+logits = model(input_ids).logits
+
+# print output probabilities
+next_token_logits = logits[-1, :]
+next_token_probs = F.softmax(next_token_logits, dim=-1)
+for i, prob in enumerate(next_token_probs):
+    print(f"{tokenizer.decode(i)}: {100 * prob:.2f}%")
+```
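
Note on the direction tokens: since the model was trained on both N -> C and C -> N orientations, generation can be steered by starting the prompt with "1" (N -> C) or "2" (C -> N). Below is a minimal sampling sketch under the same setup as the README example above; the family token, `max_length`, `temperature`, and `top_p` values are illustrative assumptions, not values from this commit.

```python
# Minimal sampling sketch (assumes `model` and `tokenizer` from the README example).
# The family token and the generation parameters below are illustrative assumptions.
import torch

prompt = "<|pf00125|>2"  # leading "2" asks for a C -> N sequence; "1" would give N -> C
input_ids = torch.tensor(tokenizer.encode(prompt)).unsqueeze(0).to(model.device)  # add batch dim

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=64,    # total length including the prompt
        do_sample=True,   # sample instead of greedy decoding
        temperature=0.8,
        top_p=0.9,
    )

# decode the sampled ids back into the family tag, direction token, and residues
print(tokenizer.decode(output[0].tolist()))
```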