kangqi-ni
/

Llama-3.1-8B-Instruct_bio-tutor_dpo

Model card Files Files and versions Community

Llama-3.1-8B-Instruct_bio-tutor_dpo / README.md

kangqi-ni's picture

Update README.md

50ca12f verified 9 days ago

|

history blame contribute delete

621 Bytes

	---
	license: llama3.1
	language:
	- en
	tags:
	- dpo
	- biology
	- education
	- llama
	---

	This model is trained on Llama-3.1-8B-Instruct with SFT and DPO. The purpose is to develop a more capable educational chatbot that helps students study biology.

	If you use this work, please cite:
	```
	@misc{sonkar2024pedagogical,
	title={Pedagogical Alignment of Large Language Models},
	author={Shashank Sonkar and Kangqi Ni and Sapana Chaudhary and Richard G. Baraniuk},
	year={2024},
	eprint={2402.05000},
	archivePrefix={arXiv},
	primaryClass={cs.CL},
	url={https://arxiv.org/abs/2402.05000}
	}
	```