|
--- |
|
library_name: transformers |
|
license: mit |
|
language: |
|
- de |
|
--- |
|
|
|
# MCQBert Model Card |
|
|
|
MCQBert is a BERT-based model fine-tuned to predict the correct answers to multiple-choice questions (MCQs) in Intelligent Tutoring Systems (ITS). Built on [LernnaviBERT](https://huggingface.co/epfl-ml4ed/LernnaviBERT), it understands and processes educational language in German, especially in grammar teaching, where sentences may intentionally contain mistakes. The model processes both the question text and a candidate answer to predict whether that answer is correct, and is designed to be further fine-tuned on students' interactions for Student Answer Forecasting.
|
It is trained on a single objective: given a question-answer pair, classify whether the answer is correct.
|
|
|
### Model Sources |
|
|
|
- **Repository:** [https://github.com/epfl-ml4ed/answer-forecasting](https://github.com/epfl-ml4ed/answer-forecasting) |
|
- **Paper:** [https://arxiv.org/abs/2405.20079](https://arxiv.org/abs/2405.20079) |
|
|
|
### Direct Use |
|
|
|
MCQBert is primarily intended to predict correct answers to MCQs in Intelligent Tutoring Systems (ITS). Given a question-answer pair, it performs binary classification to decide whether the answer is correct.
|
|
|
### Downstream Use |
|
|
|
Its intended downstream use is to be fine-tuned on user interactions (as in [MCQStudentBertCat](https://huggingface.co/epfl-ml4ed/MCQStudentBertCat) and [MCQStudentBertSum](https://huggingface.co/epfl-ml4ed/MCQStudentBertSum)) for Student Answer Forecasting, as described in [https://arxiv.org/abs/2405.20079](https://arxiv.org/abs/2405.20079).
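
MCQStudentBertCat and MCQStudentBertSum additionally integrate student interaction embeddings; their exact architectures and training recipes are described in the paper and repository above. Purely as a loose, hypothetical sketch of the general fine-tuning pattern (the toy data, the batch format, and the assumption of a one-logit output head are all illustrative, not the released setup):

```python
import torch
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("epfl-ml4ed/MCQBert", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-uncased")

optimizer = torch.optim.AdamW(model.parameters(), lr=1.75e-5)
loss_fn = torch.nn.BCEWithLogitsLoss()

# hypothetical toy data: (question-answer text, label) pairs
pairs = [
    (f"Q: Beispiel-Frage {tokenizer.sep_token}A: richtige Antwort", 1.0),
    (f"Q: Beispiel-Frage {tokenizer.sep_token}A: falsche Antwort", 0.0),
]

model.train()
for text, label in pairs:
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    logit = model(input_ids).squeeze()  # assumes the custom head returns one logit per pair
    loss = loss_fn(logit, torch.tensor(label))
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```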
|
|
|
## Bias, Risks, and Limitations |
|
|
|
While MCQBert is effective, it has some limitations: |
|
|
|
- It is primarily trained on German-language MCQs and may not generalize well to other languages or subjects without further fine-tuning.

- The model may not capture all nuances of student learning behavior, particularly in diverse educational contexts.
|
|
|
Privacy: No personally identifiable information has been used in any training phase. |
|
|
|
## How to Use MCQBert |
|
|
|
```python |
|
import torch
from transformers import AutoModel, AutoTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# load MCQBert (custom architecture, hence trust_remote_code) and the German BERT tokenizer it builds on
model_bert = AutoModel.from_pretrained("epfl-ml4ed/MCQBert", trust_remote_code=True).to(device)
tokenizer_bert = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-uncased")

# format the question-answer pair, separated by the tokenizer's [SEP] token
qna = f"Q: my_question {tokenizer_bert.sep_token}A: candidate_answer"

# sigmoid maps the model's logit to a probability; > 0.5 means the answer is predicted correct
output = torch.sigmoid(
    model_bert(
        tokenizer_bert(qna, return_tensors="pt").input_ids.to(device),
    ).cpu()
).item() > 0.5

print(output)
|
``` |
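
Because the model scores each question-answer pair independently, a natural way to answer a full MCQ is to score every candidate option and keep the highest-scoring one. The sketch below reuses `model_bert`, `tokenizer_bert`, and `device` from the snippet above; the question and options are hypothetical placeholders, not Lernnavi data.

```python
# hypothetical question and candidate options
question = "Welcher Satz ist korrekt?"
options = ["Antwort 1", "Antwort 2", "Antwort 3", "Antwort 4"]

scores = []
for option in options:
    qna = f"Q: {question} {tokenizer_bert.sep_token}A: {option}"
    input_ids = tokenizer_bert(qna, return_tensors="pt").input_ids.to(device)
    with torch.no_grad():  # inference only
        scores.append(torch.sigmoid(model_bert(input_ids).cpu()).item())

# keep the option the model considers most likely to be correct
best = max(range(len(options)), key=lambda i: scores[i])
print(options[best], scores)
```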
|
|
|
## Training Details |
|
|
|
The model was trained on questions from Lernnavi, a real-world ITS, for 20k steps with a batch size of 16. The optimizer used is AdamW with learning rate \\(1.75 \times 10^{-5}\\), \\(\beta_{1} = 0.9\\), \\(\beta_{2} = 0.999\\), and a weight decay of 0.01.
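
For reference, a sketch of that optimizer configuration in PyTorch (here `model` stands for the LernnaviBERT-initialized network being fine-tuned; the classification head is omitted):

```python
import torch
from transformers import AutoModel

# the LernnaviBERT base that MCQBert is fine-tuned from
model = AutoModel.from_pretrained("epfl-ml4ed/LernnaviBERT")

# AdamW with the hyperparameters reported above
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1.75e-5,
    betas=(0.9, 0.999),
    weight_decay=0.01,
)
```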
|
|
|
|
|
## Citation |
|
|
|
If you find this model useful in your work, please cite our paper:
|
|
|
``` |
|
@misc{gado2024student, |
|
title={Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning}, |
|
author={Elena Grazia Gado and Tommaso Martorella and Luca Zunino and Paola Mejia-Domenzain and Vinitra Swamy and Jibril Frej and Tanja Käser}, |
|
year={2024}, |
|
eprint={2405.20079}, |
|
archivePrefix={arXiv}, |
|
} |
|
``` |
|
|
|
``` |
|
Gado, E., Martorella, T., Zunino, L., Mejia-Domenzain, P., Swamy, V., Frej, J., Käser, T. (2024). |
|
Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning. |
|
In: Proceedings of the Conference on Educational Data Mining (EDM 2024). |
|
``` |
|
|
|
|