---
library_name: transformers
base_model:
- facebook/opt-350m
pipeline_tag: text-classification
---

This is the official reward model developed in the paper "[Expert of Experts Verification and Alignment (EVAL) Framework for Large Language Models Safety in Gastroenterology](https://www.nature.com/articles/s41746-025-01589-z)".

To load the model from Hugging Face:

```python
from transformers import pipeline

grader = pipeline(
    "text-classification",
    model="ZachariahPang/medical_reward_model",
    truncation=True,
    max_length=2048,
)
```

The model was trained on inputs formatted with the template `Question: {question} Answer: {answer}`. For best performance, make sure to format your inputs the same way. You can use our helper functions: `grade()` for scoring individual answers and `rejection_sampling()` for selecting the best answer from multiple candidates. Both rely on the following formatting helper:

```python
def format_input(question: str | list[str], answer: str | list[str]) -> list[str]:
    """Format question-answer pairs into the template the model was trained on."""
    if isinstance(question, str):
        question = [question]
    if isinstance(answer, str):
        answer = [answer]
    assert len(question) == len(answer), "question and answer must have the same length"
    return [f"Question: {q} Answer: {a}" for q, a in zip(question, answer)]
```

To grade one or several question-answer pairs, first load the model and then:

```python
from transformers import Pipeline


def grade(question: str | list[str], answer: str | list[str], grader: Pipeline) -> list[str]:
    """Grade one or multiple question-answer pairs using the reward model.

    Args:
        question (str | list[str]): Either a single question string or a list of questions
        answer (str | list[str]): Either a single answer string or a list of answers.
            If lists are provided, their length must match the questions list,
            and the order should correspond (i.e., questions[i] pairs with answers[i])
        grader (Pipeline): The loaded reward model pipeline

    Returns:
        list[str]: A list of labels ("positive" or "negative"), one per Q&A pair
    """
    inputs = format_input(question, answer)
    outputs = grader(inputs)
    return [output["label"] for output in outputs]
```

To do rejection sampling with the reward model and select the best answer, first load the model and then:

```python
import numpy as np
from transformers import Pipeline


def rejection_sampling(
    question: str, answer_candidates: list[str], grader: Pipeline
) -> str:
    """Select the best answer from candidates using the reward model scores.

    Args:
        question (str): The input question to find the best answer for
        answer_candidates (list[str]): List of potential answers to choose from
        grader (Pipeline): The loaded reward model pipeline

    Returns:
        str: The answer with the highest positive score among the candidates
    """
    questions = [question] * len(answer_candidates)
    inputs = format_input(questions, answer_candidates)
    outputs = grader(inputs)

    # `score` is the probability the model assigns to the predicted `label`;
    # convert every score to the probability of "positive" so argmax is consistent.
    positive_score = []
    for output in outputs:
        if output["label"] == "positive":
            positive_score.append(output["score"])
        else:  # label is "negative"
            positive_score.append(1 - output["score"])

    return answer_candidates[np.argmax(positive_score)]
```
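Putting it together, here is a minimal end-to-end sketch, assuming the `grader` pipeline and the helpers above are already defined. The question and candidate answers are illustrative placeholders, not examples from the paper, and the printed label is only what one might expect:

```python
# Hypothetical inputs for illustration only.
question = "What is the first-line treatment for mild ulcerative colitis?"
candidates = [
    "5-aminosalicylates such as mesalamine are typically first-line therapy.",
    "No treatment is needed; symptoms resolve on their own.",
]

# Score a single question-answer pair.
labels = grade(question, candidates[0], grader)
print(labels)  # e.g., ["positive"]

# Pick the candidate the reward model scores highest.
best = rejection_sampling(question, candidates, grader)
print(best)
```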