---
license: mit
datasets:
- rajpurkar/squad
language:
- en
metrics:
- accuracy
base_model:
- google-bert/bert-base-cased
- distilbert/distilbert-base-uncased
- google/electra-small-discriminator
- distilbert/distilgpt2
- jinmang2/retro-reader
pipeline_tag: question-answering
library_name: transformers
tags:
- question-answering
- SQuAD
- BERT
- DistilBERT
- ELECTRA
- GPT-2
- transformers
- machine-learning
- natural-language-processing
---

# AAI-520 Final Project Models

This repository contains the fine-tuned models developed for the AAI-520 Final Project: **SQuAD Q&A ChatBot**. The models are fine-tuned on the Stanford Question Answering Dataset (SQuAD) and are designed to facilitate question-answering tasks using various architectures.

## Authors

- [Zain Ali](https://github.com/zainnobody)
- [Ben Hopwood](https://github.com/AIBenHopwood)

## Table of Contents

- [Introduction](#introduction)
- [Available Models](#available-models)
- [Model Details](#model-details)
  - [1. BERT-base-cased Model](#1-bert-base-cased-model)
  - [2. DistilBERT-base-uncased Model](#2-distilbert-base-uncased-model)
  - [3. DistilGPT-2 Model](#3-distilgpt-2-model)
  - [4. Retro-Reader Model](#4-retro-reader-model)
  - [5. ELECTRA Model](#5-electra-model)
- [Usage](#usage)
  - [Installation](#installation)
  - [Loading a Model](#loading-a-model)
  - [Example Usage](#example-usage)
- [Citations](#citations)
- [License](#license)
- [Acknowledgments](#acknowledgments)

## Introduction

The models in this repository are part of a project aimed at developing a generative-based chatbot capable of engaging in multi-turn conversations, adapting to context, and handling a wide range of topics. By leveraging the SQuAD dataset, these models are fine-tuned to provide accurate and contextually relevant responses to user queries.

## Available Models

The following fine-tuned models are available in this repository:

1. **BERT-base-cased Models**
   - [`fine_tuned_bert_base_cased_1000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_bert_base_cased_1000)
   - [`fine_tuned_bert_base_cased_all`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_bert_base_cased_all)
2. **DistilBERT-base-uncased Model**
   - [`fine_tuned_distilbert_base_uncased_10000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_distilbert_base_uncased_10000)
3. **DistilGPT-2 Model**
   - [`fine_tuned_distilgpt2_10000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_distilgpt2_10000)
4. **Retro-Reader Models**
   - [`fine_tuned_retro-reader_intensive_1000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_retro-reader_intensive_1000)
   - [`fine_tuned_retro-reader_intensive_5000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_retro-reader_intensive_5000)
   - [`fine_tuned_retro-reader_sketchy_1000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_retro-reader_sketchy_1000)
5. **ELECTRA Models** (Recommended)
   - [`fine_tuned_electra_model_1000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_electra_model_1000)
   - [`fine_tuned_electra_model_5000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_electra_model_5000)
   - [`fine_tuned_electra_model_20000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_electra_model_20000)
   - [`fine_tuned_electra_model_all`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_electra_model_all)

**Note**: We recommend the ELECTRA models for the best performance.

## Model Details

### 1. BERT-base-cased Model

**Description**: Fine-tuned the pre-trained `bert-base-cased` model on the SQuAD dataset for question-answering tasks.

**Approach**:
- **Initial Test**: Trained on a subset of 1,000 data points to validate the setup.
- **Full Training**: Extended training to the entire dataset after successful initial testing.

**Results**:
- **Training Metrics**:
  - **Batch Size**: 8
  - **Epochs**: 6
- **Observations**:
  - Model performance improved with more epochs but plateaued after a certain point.
  - Initial tests confirmed the feasibility of using BERT for the task.

### 2. DistilBERT-base-uncased Model

**Description**: Utilized `distilbert-base-uncased`, a lighter and faster variant of BERT, to reduce computational requirements.

**Approach**:
- Trained on 10,000 data points due to resource constraints.
- Adjusted the input formatting and preprocessing steps.

**Results**:
- **Challenges**:
  - Encountered low accuracy and performance issues.
  - Incompatibility with the Gradio frontend hindered deployment.
- **Conclusion**:
  - The model did not meet the desired performance metrics.

### 3. DistilGPT-2 Model

**Description**: Experimented with `distilgpt2` to test a generative approach to question answering.

**Approach**:
- Prepared input data by combining context and questions.
- Fine-tuned the model with custom tokenization and data collators.

**Results**:
- **Evaluation Metrics**:
  - Obtained an evaluation loss, but F1 and accuracy could not be calculated due to memory issues.
- **Challenges**:
  - Resource limitations prevented extensive evaluation.
  - The model did not perform satisfactorily on the question-answering task.

### 4. Retro-Reader Model

**Description**: Implemented the [Retro-Reader](https://arxiv.org/abs/2001.09694) model, designed for machine reading comprehension tasks.

**Approach**:
- Trained both the Sketchy Reading and Intensive Reading components.
- Conducted experiments with datasets of 1,000 and 5,000 data points.

**Results**:
- **Performance**:
  - Achieved low accuracy in both Sketchy and Intensive modes.
- **Conclusion**:
  - The model did not yield better results than the previous models.
  - Further research and optimization would be required to make it effective.

### 5. ELECTRA Model

**Description**: Adopted ELECTRA for its efficient learning capabilities and strong performance on language understanding tasks.

**Approach**:
- Trained on varying dataset sizes: 1,000, 5,000, 20,000, and the full dataset.
- Utilized the `google/electra-small-discriminator` model.

**Results**:
- **Training Metrics**:
  - **Batch Size**: 8
  - **Epochs**: 6
- **Observations**:
  - Performance improved consistently with larger training sets.
  - ELECTRA outperformed the previous models, becoming the preferred choice for deployment.
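For reference, the sketch below shows one way to reproduce the ELECTRA fine-tuning setup described above (the `google/electra-small-discriminator` base, batch size 8, 6 epochs, SQuAD loaded via `datasets`) using the standard Hugging Face `Trainer` workflow. It is a minimal illustration rather than the project's exact training script: the `preprocess` helper, the learning rate, and the maximum sequence length are assumptions.

```python
# Minimal fine-tuning sketch (illustrative, not the project's exact script).
from datasets import load_dataset
from transformers import (
    AutoModelForQuestionAnswering,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
    default_data_collator,
)

base_model = "google/electra-small-discriminator"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForQuestionAnswering.from_pretrained(base_model)

squad = load_dataset("rajpurkar/squad")
# The project also trained on subsets, e.g. squad["train"].select(range(1000)).

def preprocess(examples):
    # Tokenize question/context pairs, truncating only the context, and keep
    # character offsets so answer spans can be mapped to token positions.
    tokenized = tokenizer(
        examples["question"],
        examples["context"],
        max_length=384,                # assumed max length
        truncation="only_second",
        padding="max_length",
        return_offsets_mapping=True,
    )
    offset_mapping = tokenized.pop("offset_mapping")
    start_positions, end_positions = [], []
    for i, offsets in enumerate(offset_mapping):
        answer = examples["answers"][i]
        start_char = answer["answer_start"][0]
        end_char = start_char + len(answer["text"][0])
        sequence_ids = tokenized.sequence_ids(i)
        # First and last token indices belonging to the context.
        ctx_start = sequence_ids.index(1)
        ctx_end = len(sequence_ids) - 1 - sequence_ids[::-1].index(1)
        if offsets[ctx_start][0] > start_char or offsets[ctx_end][1] < end_char:
            # Answer was truncated out of the context: point both labels at [CLS].
            start_positions.append(0)
            end_positions.append(0)
        else:
            idx = ctx_start
            while idx <= ctx_end and offsets[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)
            idx = ctx_end
            while idx >= ctx_start and offsets[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)
    tokenized["start_positions"] = start_positions
    tokenized["end_positions"] = end_positions
    return tokenized

train_data = squad["train"].map(
    preprocess, batched=True, remove_columns=squad["train"].column_names
)

args = TrainingArguments(
    output_dir="fine_tuned_electra_model_all",
    per_device_train_batch_size=8,   # batch size used in the project
    num_train_epochs=6,              # epochs used in the project
    learning_rate=3e-5,              # assumed; not documented in the project
)

Trainer(
    model=model,
    args=args,
    train_dataset=train_data,
    data_collator=default_data_collator,
).train()
```

The same recipe applies to the BERT and DistilBERT runs by swapping the base model name and the training subset size.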
## Usage

### Installation

To use these models, you need to have the `transformers` library installed:

```bash
pip install transformers
```

### Loading a Model

Each fine-tuned model lives in its own subfolder of this repository, so pass the repository ID together with the `subfolder` argument to `from_pretrained`:

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

repo_id = "zainnobody/AAI-520-Final-Project-Models"
subfolder = "fine_tuned_electra_model_all"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subfolder)
model = AutoModelForQuestionAnswering.from_pretrained(repo_id, subfolder=subfolder)
```

### Example Usage

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline

repo_id = "zainnobody/AAI-520-Final-Project-Models"
subfolder = "fine_tuned_electra_model_all"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subfolder)
model = AutoModelForQuestionAnswering.from_pretrained(repo_id, subfolder=subfolder)

qa_pipeline = pipeline("question-answering", model=model, tokenizer=tokenizer)

context = "The Stanford Question Answering Dataset is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles."
question = "What does SQuAD stand for?"

result = qa_pipeline(question=question, context=context)
print(f"Answer: {result['answer']}")
```

**Output**:

```
Answer: Stanford Question Answering Dataset
```

## Citations

- Rajpurkar, P., Zhang, J., Lopyrev, K., & Liang, P. (2016). [SQuAD: 100,000+ Questions for Machine Comprehension of Text](https://arxiv.org/abs/1606.05250). *arXiv preprint arXiv:1606.05250*.
- Clark, K., Luong, M. T., Le, Q. V., & Manning, C. D. (2020). [ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators](https://arxiv.org/abs/2003.10555). *arXiv preprint arXiv:2003.10555*.

## License

This project is licensed under the MIT License - see the [LICENSE](https://github.com/zainnobody/AAI-520-Final-Project/blob/main/LICENSE) file for details.

## Acknowledgments

- The models are trained and fine-tuned using resources from [Hugging Face](https://huggingface.co/).
- OpenAI's ChatGPT and GitHub Copilot were used to create, iterate on, and improve code documentation. All outputs were edited and improved by the authors in the final versions.

---

For any questions or issues, please feel free to contact the authors or open an issue on the [GitHub repository](https://github.com/zainnobody/AAI-520-Final-Project).