---
license: mit
datasets:
- rajpurkar/squad
language:
- en
metrics:
- accuracy
base_model:
- google-bert/bert-base-cased
- distilbert/distilbert-base-uncased
- google/electra-small-discriminator
- distilbert/distilgpt2
- jinmang2/retro-reader
pipeline_tag: question-answering
library_name: transformers
tags:
- question-answering
- SQuAD
- BERT
- DistilBERT
- ELECTRA
- GPT-2
- transformers
- machine-learning
- natural-language-processing
---

# AAI-520 Final Project Models

This repository contains the fine-tuned models developed for the AAI-520 Final Project: **SQuAD Q&A ChatBot**. The models are fine-tuned on the Stanford Question Answering Dataset (SQuAD) and are designed to facilitate question-answering tasks using various architectures.

## Authors

- [Zain Ali](https://github.com/zainnobody)
- [Ben Hopwood](https://github.com/AIBenHopwood)

## Table of Contents

- [Introduction](#introduction)
- [Available Models](#available-models)
- [Model Details](#model-details)
  - [1. BERT-base-cased Model](#1-bert-base-cased-model)
  - [2. DistilBERT-base-uncased Model](#2-distilbert-base-uncased-model)
  - [3. DistilGPT-2 Model](#3-distilgpt-2-model)
  - [4. Retro-Reader Model](#4-retro-reader-model)
  - [5. ELECTRA Model](#5-electra-model)
- [Usage](#usage)
  - [Installation](#installation)
  - [Loading a Model](#loading-a-model)
  - [Example Usage](#example-usage)
- [Citations](#citations)
- [License](#license)
- [Acknowledgments](#acknowledgments)

## Introduction

The models in this repository are part of a project aimed at developing a generative-based chatbot capable of engaging in multi-turn conversations, adapting to context, and handling a wide range of topics. By leveraging the SQuAD dataset, these models are fine-tuned to provide accurate and contextually relevant responses to user queries.

## Available Models

The following fine-tuned models are available in this repository:

1. **BERT-base-cased Models**
   - [`fine_tuned_bert_base_cased_1000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_bert_base_cased_1000)
   - [`fine_tuned_bert_base_cased_all`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_bert_base_cased_all)
2. **DistilBERT-base-uncased Model**
   - [`fine_tuned_distilbert_base_uncased_10000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_distilbert_base_uncased_10000)
3. **DistilGPT-2 Model**
   - [`fine_tuned_distilgpt2_10000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_distilgpt2_10000)
4. **Retro-Reader Models**
   - [`fine_tuned_retro-reader_intensive_1000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_retro-reader_intensive_1000)
   - [`fine_tuned_retro-reader_intensive_5000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_retro-reader_intensive_5000)
   - [`fine_tuned_retro-reader_sketchy_1000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_retro-reader_sketchy_1000)
5. **ELECTRA Models** (Recommended)
   - [`fine_tuned_electra_model_1000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_electra_model_1000)
   - [`fine_tuned_electra_model_5000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_electra_model_5000)
   - [`fine_tuned_electra_model_20000`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_electra_model_20000)
   - [`fine_tuned_electra_model_all`](https://huggingface.co/zainnobody/AAI-520-Final-Project-Models/tree/main/fine_tuned_electra_model_all)

**Note**: We recommend the ELECTRA models for the best performance.

## Model Details

### 1. BERT-base-cased Model

**Description**: Fine-tuned the pre-trained `bert-base-cased` model on the SQuAD dataset for question-answering tasks.

**Approach**:
- **Initial Test**: Trained on a subset of 1,000 data points to validate the setup.
- **Full Training**: Extended training to the entire dataset after successful initial testing.

**Results**:
- **Training Metrics**:
  - **Batch Size**: 8
  - **Epochs**: 6
- **Observations**:
  - Model performance improved with more epochs but plateaued after a certain point.
  - Initial tests confirmed the feasibility of using BERT for the task.

### 2. DistilBERT-base-uncased Model

**Description**: Utilized `distilbert-base-uncased`, a lighter and faster variant of BERT, to reduce computational requirements.

**Approach**:
- Trained on 10,000 data points due to resource constraints.
- Adjusted the input formatting and preprocessing steps.

**Results**:
- **Challenges**:
  - Encountered low accuracy and performance issues.
  - Incompatibility with the Gradio frontend hindered deployment.
- **Conclusion**:
  - The model did not meet the desired performance metrics.

### 3. DistilGPT-2 Model

**Description**: Experimented with `distilgpt2` to test a generative approach to question answering.

**Approach**:
- Prepared input data by combining context and questions.
- Fine-tuned the model with custom tokenization and data collators.

**Results**:
- **Evaluation Metrics**:
  - Obtained an evaluation loss, but F1 and accuracy could not be calculated due to memory issues.
- **Challenges**:
  - Resource limitations prevented extensive evaluation.
  - The model did not perform satisfactorily on the question-answering task.

### 4. Retro-Reader Model

**Description**: Implemented the [Retro-Reader](https://arxiv.org/abs/2001.09694) model, designed for machine reading comprehension tasks.

**Approach**:
- Trained both the Sketchy Reading and Intensive Reading components.
- Conducted experiments with datasets of 1,000 and 5,000 data points.

**Results**:
- **Performance**:
  - Achieved low accuracy in both Sketchy and Intensive modes.
- **Conclusion**:
  - The model did not yield better results than the previous models.
  - Further research and optimization would be required to make it effective.

### 5. ELECTRA Model

**Description**: Adopted ELECTRA for its efficient learning capabilities and strong performance on language understanding tasks.

**Approach**:
- Trained on varying dataset sizes: 1,000, 5,000, 20,000, and the full dataset.
- Utilized the `google/electra-small-discriminator` model.

**Results**:
- **Training Metrics**:
  - **Batch Size**: 8
  - **Epochs**: 6
- **Observations**:
  - Performance improved consistently with larger training sets.
  - ELECTRA outperformed the previous models, becoming the preferred choice for deployment.
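For reference, the sketch below shows one way to reproduce the ELECTRA fine-tuning setup described above (the `google/electra-small-discriminator` base, batch size 8, 6 epochs, SQuAD loaded via `datasets`) using the standard Hugging Face `Trainer` workflow. It is a minimal illustration rather than the project's exact training script: the `preprocess` helper, the learning rate, and the maximum sequence length are assumptions.

```python
# Minimal fine-tuning sketch (illustrative, not the project's exact script).
from datasets import load_dataset
from transformers import (
    AutoModelForQuestionAnswering,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
    default_data_collator,
)

base_model = "google/electra-small-discriminator"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForQuestionAnswering.from_pretrained(base_model)

squad = load_dataset("rajpurkar/squad")
# The project also trained on subsets, e.g. squad["train"].select(range(1000)).

def preprocess(examples):
    # Tokenize question/context pairs, truncating only the context, and keep
    # character offsets so answer spans can be mapped to token positions.
    tokenized = tokenizer(
        examples["question"],
        examples["context"],
        max_length=384,                # assumed max length
        truncation="only_second",
        padding="max_length",
        return_offsets_mapping=True,
    )
    offset_mapping = tokenized.pop("offset_mapping")
    start_positions, end_positions = [], []
    for i, offsets in enumerate(offset_mapping):
        answer = examples["answers"][i]
        start_char = answer["answer_start"][0]
        end_char = start_char + len(answer["text"][0])
        sequence_ids = tokenized.sequence_ids(i)
        # First and last token indices belonging to the context.
        ctx_start = sequence_ids.index(1)
        ctx_end = len(sequence_ids) - 1 - sequence_ids[::-1].index(1)
        if offsets[ctx_start][0] > start_char or offsets[ctx_end][1] < end_char:
            # Answer was truncated out of the context: point both labels at [CLS].
            start_positions.append(0)
            end_positions.append(0)
        else:
            idx = ctx_start
            while idx <= ctx_end and offsets[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)
            idx = ctx_end
            while idx >= ctx_start and offsets[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)
    tokenized["start_positions"] = start_positions
    tokenized["end_positions"] = end_positions
    return tokenized

train_data = squad["train"].map(
    preprocess, batched=True, remove_columns=squad["train"].column_names
)

args = TrainingArguments(
    output_dir="fine_tuned_electra_model_all",
    per_device_train_batch_size=8,   # batch size used in the project
    num_train_epochs=6,              # epochs used in the project
    learning_rate=3e-5,              # assumed; not documented in the project
)

Trainer(
    model=model,
    args=args,
    train_dataset=train_data,
    data_collator=default_data_collator,
).train()
```

The same recipe applies to the BERT and DistilBERT runs by swapping the base model name and the training subset size.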
## Usage

### Installation

To use these models, you need to have the `transformers` library installed:

```bash
pip install transformers
```

### Loading a Model

Each fine-tuned model lives in its own subfolder of this repository, so pass the repository ID together with the `subfolder` argument to `from_pretrained`:

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

repo_id = "zainnobody/AAI-520-Final-Project-Models"
subfolder = "fine_tuned_electra_model_all"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subfolder)
model = AutoModelForQuestionAnswering.from_pretrained(repo_id, subfolder=subfolder)
```

### Example Usage

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline

repo_id = "zainnobody/AAI-520-Final-Project-Models"
subfolder = "fine_tuned_electra_model_all"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subfolder)
model = AutoModelForQuestionAnswering.from_pretrained(repo_id, subfolder=subfolder)

qa_pipeline = pipeline("question-answering", model=model, tokenizer=tokenizer)

context = "The Stanford Question Answering Dataset is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles."
question = "What does SQuAD stand for?"

result = qa_pipeline(question=question, context=context)
print(f"Answer: {result['answer']}")
```

**Output**:

```
Answer: Stanford Question Answering Dataset
```

## Citations

- Rajpurkar, P., Zhang, J., Lopyrev, K., & Liang, P. (2016). [SQuAD: 100,000+ Questions for Machine Comprehension of Text](https://arxiv.org/abs/1606.05250). *arXiv preprint arXiv:1606.05250*.
- Clark, K., Luong, M. T., Le, Q. V., & Manning, C. D. (2020). [ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators](https://arxiv.org/abs/2003.10555). *arXiv preprint arXiv:2003.10555*.

## License

This project is licensed under the MIT License - see the [LICENSE](https://github.com/zainnobody/AAI-520-Final-Project/blob/main/LICENSE) file for details.

## Acknowledgments

- The models are trained and fine-tuned using resources from [Hugging Face](https://huggingface.co/).
- OpenAI's ChatGPT and GitHub Copilot were used to create, iterate on, and improve code documentation. All outputs were edited and improved by the authors in the final versions.

---

For any questions or issues, please feel free to contact the authors or open an issue on the [GitHub repository](https://github.com/zainnobody/AAI-520-Final-Project).