Update README.md
README.md
CHANGED
@@ -41,7 +41,7 @@ We recommend using the latest version of HF Transformers, or any `transformers>=
 Below we provide a code snippet demonstrating how to load the tokenizer and model and score a candidate instruction. We strongly recommend formatting the instruction input as shown, to maintain consistency with the format of the data used during training of MDCureRM. As the model outputs values normalized to the 0-1 range, we scale the output scores to the 1-5 range for more interpretable results. The relative weighting of the fine-grained rewards may be configured as desired to obtain the final score; we reproduce the weights used in our implementation in `reward_weights` below.
 
 ```python
-from transformers import AutoTokenizer, AutoModel, LlamaConfig, PreTrainedModel, LlamaForSequenceClassification
+from transformers import AutoTokenizer, AutoModel, AutoConfig, LlamaConfig, PreTrainedModel, LlamaForSequenceClassification
 import torch.nn as nn
 import torch
 
@@ -101,6 +101,9 @@ class RewardModel(PreTrainedModel):
     def prepare_inputs_for_generation(self, *args, **kwargs):
         return self.BASE_MODEL.prepare_inputs_for_generation(*args, **kwargs)
 
+AutoConfig.register("RewardModel", RewardModelConfig)
+AutoModel.register(RewardModelConfig, RewardModel)
+
 model = AutoModel.from_pretrained("yale-nlp/MDCureRM").to(torch.device("cuda"))
 tokenizer = AutoTokenizer.from_pretrained("yale-nlp/MDCureRM", use_fast=True)
 tokenizer.pad_token = tokenizer.eos_token
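The scaling and weighting step described in the README prose can be sketched as follows. This is a minimal illustration, not code from the repository: the helper name `combine_rewards`, the number of fine-grained rewards in the example, and the uniform placeholder weights are all assumptions, and the actual `reward_weights` values from the implementation are not reproduced here.

```python
def combine_rewards(raw_scores, reward_weights):
    """Map raw model outputs in [0, 1] to the 1-5 range, then take a
    weighted sum of the fine-grained rewards to get the final score.

    Hypothetical helper for illustration; not part of MDCureRM's code.
    """
    # Linear rescaling: 0.0 -> 1.0 and 1.0 -> 5.0
    scaled = [1.0 + 4.0 * s for s in raw_scores]
    assert abs(sum(reward_weights) - 1.0) < 1e-9, "weights should sum to 1"
    return sum(w * s for w, s in zip(reward_weights, scaled))

# Example with six fine-grained rewards and uniform placeholder weights
# (the real weights are given in `reward_weights` in the README snippet):
raw = [0.0, 0.25, 0.5, 0.75, 1.0, 0.5]
weights = [1.0 / 6.0] * 6
final = combine_rewards(raw, weights)  # a single score in the 1-5 range
```

With uniform weights the final score is just the mean of the rescaled rewards; non-uniform weights let some criteria count more toward the final score than others.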