Model Card for deberta-v3-large-Rationale-to-Score

This repository hosts a version of microsoft/deberta-v3-large fine-tuned to assess text-based rationales and generate corresponding scores. As shown in the example below, the model takes a free-text rationale as input and outputs a numerical score.
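
Below is a minimal usage sketch in Python. It assumes the checkpoint loads with `AutoModelForSequenceClassification` and that the score can be read either directly from a single regression output or as the argmax over discrete score classes; the actual head configuration is determined by the checkpoint's own config.

```python
# Minimal usage sketch. Assumption: the checkpoint exposes a standard
# sequence-classification head loadable via AutoModelForSequenceClassification;
# whether it is a single-output regression head or a set of discrete score
# classes is read from the checkpoint's config.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "jiazhengli/deberta-v3-large-Rationale-to-Score"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Example free-text rationale (illustrative input, not from the training data).
rationale = (
    "The response correctly explains that increasing the temperature speeds up "
    "evaporation, and supports this with the observed data."
)

inputs = tokenizer(rationale, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Single-output head: treat the raw logit as the predicted score.
# Multi-class head: take the most likely score class instead.
if logits.shape[-1] == 1:
    score = logits.squeeze(-1).item()
else:
    score = logits.argmax(dim=-1).item()

print(f"Predicted score: {score}")
```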

For details on the training process and methodology, please refer to our research paper: Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring.

If you use this model in your research, please cite our work:

Citation Information

@misc{li2024calibratingllmspreferenceoptimization,
      title={Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring}, 
      author={Jiazheng Li and Hainiu Xu and Zhaoyue Sun and Yuxiang Zhou and David West and Cesare Aloisi and Yulan He},
      year={2024},
      eprint={2406.19949},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2406.19949}, 
}