---
language:
- en
metrics:
- accuracy
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
library_name: transformers
---

# Model Evaluation Guide

**Note:** Please check Ed for the access token.

This document provides the instructions needed to evaluate a pre-trained sequence classification model on a test dataset.

## Prerequisites

Before running the evaluation pipeline, ensure you have the following installed:

- Python 3.7+
- Required Python libraries

Install them by running:

```bash
pip install transformers datasets evaluate torch
```

## Dataset Information

The test dataset is hosted on the Hugging Face Hub under the namespace `CIS5190ml/Dataset`. The dataset should have the following structure:

- Column: `title`
- Column: `label`

Example entries:

- "Jack Carr's take on the late Tom Clancy..." (label: 0)
- "Feeding America CEO asks community to help..." (label: 0)
- "Trump's campaign rival decides between..." (label: 0)

## Model Information

The model being evaluated is hosted on the Hugging Face Hub under the namespace `CIS5190ml/bert4`.

## Evaluation Pipeline

The complete evaluation pipeline is provided in the file **Evaluation_Pipeline.ipynb**. This Jupyter Notebook walks you through the following steps:

1. Loading the pre-trained model and tokenizer
2. Loading and preprocessing the test dataset
3. Running predictions on the test data
4. Computing the evaluation metric (e.g., accuracy)

## Quick Start

Clone this repository and navigate to its directory:

```bash
git clone <repository-url>
cd <repository-directory>
```

Open the Jupyter Notebook:

```bash
jupyter notebook Evaluation_Pipeline.ipynb
```

Follow the step-by-step instructions in the notebook to evaluate the model.

## Code Example

Here is an overview of the evaluation pipeline used in the notebook. Note that it loads the `CIS5190ml/test_20_rows` dataset; if your test data lives elsewhere (e.g., under `CIS5190ml/Dataset`), update the `load_dataset()` call accordingly.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from datasets import load_dataset
import evaluate
import torch
from torch.utils.data import DataLoader

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("CIS5190ml/bert4")
model = AutoModelForSequenceClassification.from_pretrained("CIS5190ml/bert4")

# Load the test dataset
ds = load_dataset("CIS5190ml/test_20_rows", split="train")

# Tokenize the `title` column, truncating/padding to the model's max length
def preprocess_function(examples):
    return tokenizer(examples["title"], truncation=True, padding="max_length")

encoded_ds = ds.map(preprocess_function, batched=True)

# Keep only the columns the model consumes, plus the labels
encoded_ds = encoded_ds.remove_columns(
    [col for col in encoded_ds.column_names
     if col not in ["input_ids", "attention_mask", "label"]]
)
encoded_ds.set_format("torch")

# Create DataLoader
test_loader = DataLoader(encoded_ds, batch_size=8)

# Evaluate
accuracy = evaluate.load("accuracy")
model.eval()

for batch in test_loader:
    with torch.no_grad():
        outputs = model(input_ids=batch["input_ids"],
                        attention_mask=batch["attention_mask"])
    preds = torch.argmax(outputs.logits, dim=-1)
    accuracy.add_batch(predictions=preds, references=batch["label"])

final_accuracy = accuracy.compute()
print("Accuracy:", final_accuracy["accuracy"])
```

## Output

After running the pipeline, the evaluation metric (e.g., accuracy) is displayed in the notebook output. Example:

```
Accuracy: 0.85
```

## Notes

* If your dataset or column names differ, update the relevant sections in the notebook.
* To use a different evaluation metric, change the metric name passed to `evaluate.load()` in the notebook (see the sketch at the end of this document).
* For any issues or questions, please feel free to reach out.
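## Quick Sanity Check (Optional)

Before running the full notebook, you can spot-check the model on a single headline with the `transformers` pipeline API. This is a minimal sketch, not part of the original notebook; the headline below is a placeholder, and the returned label names (`LABEL_0`, `LABEL_1`, ...) depend on the model's config.

```python
from transformers import pipeline

# Load the same model evaluated above as a ready-made classifier
clf = pipeline("text-classification", model="CIS5190ml/bert4")

# Placeholder headline -- substitute any `title` value from the dataset
print(clf("Jack Carr's take on the late Tom Clancy..."))
# e.g. [{'label': 'LABEL_0', 'score': 0.97}]  (exact output varies)
```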
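## Running on a GPU (Optional)

The evaluation loop in the Code Example runs on CPU by default. If a GPU is available, moving the model and each batch to the device speeds up evaluation considerably. Below is a minimal variant of that loop, assuming the `model`, `test_loader`, and `accuracy` objects defined above.

```python
import torch

# Pick the GPU if one is available, otherwise fall back to CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

for batch in test_loader:
    # Move each input tensor to the same device as the model
    input_ids = batch["input_ids"].to(device)
    attention_mask = batch["attention_mask"].to(device)
    with torch.no_grad():
        logits = model(input_ids=input_ids, attention_mask=attention_mask).logits
    preds = torch.argmax(logits, dim=-1)
    # Metrics expect CPU values, so move predictions back before accumulating
    accuracy.add_batch(predictions=preds.cpu(), references=batch["label"])

print("Accuracy:", accuracy.compute()["accuracy"])
```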
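## Switching Metrics (Optional)

As mentioned in the Notes, swapping the evaluation metric only requires changing the name passed to `evaluate.load()`. Here is a minimal sketch using F1 instead of accuracy; the predictions and references below are placeholder values for illustration, and in the notebook you would call `add_batch` inside the loop exactly as with accuracy.

```python
import evaluate

# Load F1 instead of accuracy; the default average="binary" suits two-class labels
f1 = evaluate.load("f1")

# Placeholder values for illustration only
result = f1.compute(predictions=[0, 1, 1, 0], references=[0, 1, 0, 0])
print(result)  # {'f1': 0.666...}
```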