chaoyi-wu committed
Commit b496201 · 1 Parent(s): 737d523

Update README.md

Files changed (1): README.md (+30 -1)
README.md CHANGED
@@ -6,4 +6,33 @@ language:
  - en
  tags:
  - medical
- ---
+ ---
+
+ This repo contains the latest version of PMC_LLaMA_7B, which is LLaMA-7B fine-tuned on the PMC papers in the S2ORC dataset.
+
+ The model was trained with the following hyperparameters:
+
+ * Epochs: **10**
+ * Batch size: 128
+ * Cutoff length: 512
+ * Learning rate: 2e-5
+
+ In each epoch, we sample 512 tokens per paper for training.
+
+ The model can be loaded as follows:
+
+ ```python
+ import transformers
+ import torch
+ tokenizer = transformers.LlamaTokenizer.from_pretrained('chaoyi-wu/PMC_LLAMA_7B_10_epoch')
+ model = transformers.LlamaForCausalLM.from_pretrained('chaoyi-wu/PMC_LLAMA_7B_10_epoch')
+ sentence = 'Hello, doctor'
+ batch = tokenizer(
+     sentence,
+     return_tensors="pt",
+     add_special_tokens=False
+ )
+ with torch.no_grad():
+     generated = model.generate(inputs=batch["input_ids"], max_length=200, do_sample=True, top_k=50)
+ print('model predict: ', tokenizer.decode(generated[0]))
+ ```
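
As a rough illustration of the sampling scheme described in the added README text (512 tokens drawn from each paper in every epoch), the sketch below shows one way it could be implemented. This is a minimal sketch, not the authors' training code: the `papers` list, the `sample_epoch_windows` helper, and the random-offset strategy are assumptions made for illustration.

```python
# Minimal sketch, NOT the authors' training code: one way to draw a
# 512-token window from each paper at the start of every epoch.
import random

import transformers

CUTOFF_LEN = 512  # matches the "Cutoff length" hyperparameter listed above

tokenizer = transformers.LlamaTokenizer.from_pretrained('chaoyi-wu/PMC_LLAMA_7B_10_epoch')

def sample_epoch_windows(papers, seed):
    """Return one window of at most CUTOFF_LEN tokens per paper for this epoch.

    `papers` is assumed to be a list of raw paper texts (e.g. from S2ORC);
    the random-offset choice is an assumption, not the documented recipe.
    """
    rng = random.Random(seed)
    windows = []
    for text in papers:
        ids = tokenizer(text, add_special_tokens=False)["input_ids"]
        if len(ids) <= CUTOFF_LEN:
            windows.append(ids)  # short papers are kept whole
        else:
            start = rng.randrange(len(ids) - CUTOFF_LEN + 1)
            windows.append(ids[start:start + CUTOFF_LEN])
    return windows

# Usage: resample a fresh window from every paper at each of the 10 epochs.
papers = ["...full text of paper 1...", "...full text of paper 2..."]  # placeholders
for epoch in range(10):
    epoch_token_windows = sample_epoch_windows(papers, seed=epoch)
```

Only the 512-token-per-paper budget is documented in the card; whether the released model used random or fixed offsets within each paper is not stated.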