|
--- |
|
library_name: transformers |
|
license: mit |
|
language: |
|
- de |
|
--- |
|
|
|
# MCQBert Model Card |
|
|
|
MCQBert is a BERT-based model fine-tuned to predict the correct answers to multiple-choice questions (MCQs) in Intelligent Tutoring Systems (ITS). Built on [LernnaviBERT](https://huggingface.co/epfl-ml4ed/LernnaviBERT), it understands and processes educational language in German, especially in grammar teaching, where sentences may intentionally contain mistakes. The model processes both the question text and a candidate answer to predict whether that answer is correct, and is designed to be further fine-tuned on students' interactions for Student Answer Forecasting.
|
It is trained on a single objective: given a question-answer pair, classify whether the answer is correct.
|
|
|
### Model Sources |
|
|
|
- **Repository:** [https://github.com/epfl-ml4ed/answer-forecasting](https://github.com/epfl-ml4ed/answer-forecasting) |
|
- **Paper:** [https://arxiv.org/abs/2405.20079](https://arxiv.org/abs/2405.20079) |
|
|
|
### Direct Use |
|
|
|
MCQBert is primarily intended to predict correct answers to MCQs in Intelligent Tutoring Systems (ITS). Given a question-answer pair, it performs binary classification to decide whether the answer is correct.
|
|
|
### Downstream Use |
|
|
|
Its intended downstream use is to be fine-tuned on user interactions (as in [MCQStudentBertCat](https://huggingface.co/epfl-ml4ed/MCQStudentBertCat) and [MCQStudentBertSum](https://huggingface.co/epfl-ml4ed/MCQStudentBertSum)) for Student Answer Forecasting, as described in [https://arxiv.org/abs/2405.20079](https://arxiv.org/abs/2405.20079).
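
MCQStudentBertCat and MCQStudentBertSum additionally integrate student interaction embeddings; their exact architectures and training recipes are described in the paper and repository above. Purely as a loose, hypothetical sketch of the general fine-tuning pattern (the toy data, the batch format, and the assumption of a one-logit output head are all illustrative, not the released setup):

```python
import torch
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("epfl-ml4ed/MCQBert", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-uncased")

optimizer = torch.optim.AdamW(model.parameters(), lr=1.75e-5)
loss_fn = torch.nn.BCEWithLogitsLoss()

# hypothetical toy data: (question-answer text, label) pairs
pairs = [
    (f"Q: Beispiel-Frage {tokenizer.sep_token}A: richtige Antwort", 1.0),
    (f"Q: Beispiel-Frage {tokenizer.sep_token}A: falsche Antwort", 0.0),
]

model.train()
for text, label in pairs:
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    logit = model(input_ids).squeeze()  # assumes the custom head returns one logit per pair
    loss = loss_fn(logit, torch.tensor(label))
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```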
|
|
|
## Bias, Risks, and Limitations |
|
|
|
While MCQBert is effective, it has some limitations: |
|
|
|
- It is primarily trained on German-language MCQs and may not generalize well to other languages or subjects without further fine-tuning.

- The model may not capture all nuances of student learning behavior, particularly in diverse educational contexts.
|
|
|
Privacy: No personally identifiable information has been used in any training phase. |
|
|
|
## How to Use MCQBert |
|
|
|
```python |
|
import torch
from transformers import AutoModel, AutoTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# load MCQBert (custom architecture, hence trust_remote_code) and the German BERT tokenizer it builds on
model_bert = AutoModel.from_pretrained("epfl-ml4ed/MCQBert", trust_remote_code=True).to(device)
tokenizer_bert = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-uncased")

# format the question-answer pair, separated by the tokenizer's [SEP] token
qna = f"Q: my_question {tokenizer_bert.sep_token}A: candidate_answer"

# sigmoid maps the model's logit to a probability; > 0.5 means the answer is predicted correct
output = torch.sigmoid(
    model_bert(
        tokenizer_bert(qna, return_tensors="pt").input_ids.to(device),
    ).cpu()
).item() > 0.5

print(output)
|
``` |
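
Because the model scores each question-answer pair independently, a natural way to answer a full MCQ is to score every candidate option and keep the highest-scoring one. The sketch below reuses `model_bert`, `tokenizer_bert`, and `device` from the snippet above; the question and options are hypothetical placeholders, not Lernnavi data.

```python
# hypothetical question and candidate options
question = "Welcher Satz ist korrekt?"
options = ["Antwort 1", "Antwort 2", "Antwort 3", "Antwort 4"]

scores = []
for option in options:
    qna = f"Q: {question} {tokenizer_bert.sep_token}A: {option}"
    input_ids = tokenizer_bert(qna, return_tensors="pt").input_ids.to(device)
    with torch.no_grad():  # inference only
        scores.append(torch.sigmoid(model_bert(input_ids).cpu()).item())

# keep the option the model considers most likely to be correct
best = max(range(len(options)), key=lambda i: scores[i])
print(options[best], scores)
```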
|
|
|
## Training Details |
|
|
|
The model was trained on questions from Lernnavi, a real-world ITS, for 20k steps with a batch size of 16. The optimizer used is AdamW with learning rate \\(1.75 \times 10^{-5}\\), \\(\beta_{1} = 0.9\\), \\(\beta_{2} = 0.999\\), and a weight decay of 0.01.
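
For reference, a sketch of that optimizer configuration in PyTorch (here `model` stands for the LernnaviBERT-initialized network being fine-tuned; the classification head is omitted):

```python
import torch
from transformers import AutoModel

# the LernnaviBERT base that MCQBert is fine-tuned from
model = AutoModel.from_pretrained("epfl-ml4ed/LernnaviBERT")

# AdamW with the hyperparameters reported above
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1.75e-5,
    betas=(0.9, 0.999),
    weight_decay=0.01,
)
```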
|
|
|
|
|
## Citation |
|
|
|
If you find this model useful in your work, please cite our paper:
|
|
|
``` |
|
@misc{gado2024student, |
|
title={Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning}, |
|
author={Elena Grazia Gado and Tommaso Martorella and Luca Zunino and Paola Mejia-Domenzain and Vinitra Swamy and Jibril Frej and Tanja Käser}, |
|
year={2024}, |
|
eprint={2405.20079}, |
|
archivePrefix={arXiv}, |
|
} |
|
``` |
|
|
|
``` |
|
Gado, E., Martorella, T., Zunino, L., Mejia-Domenzain, P., Swamy, V., Frej, J., Käser, T. (2024). |
|
Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning. |
|
In: Proceedings of the Conference on Educational Data Mining (EDM 2024). |
|
``` |
|
|
|
|