PLEASE CHECK ED FOR TOKEN

Model Evaluation Guide

This document explains how to evaluate a pre-trained sequence classification model on a test dataset.

Prerequisites

Before running the evaluation pipeline, ensure you have the following installed:

  • Python 3.7+
  • Required Python libraries
    Install them by running:
pip install transformers datasets evaluate torch

Dataset Information

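The test dataset is hosted on the Hugging Face Hub in the repository CIS5190ml/Dataset. Note that the code example in this guide loads CIS5190ml/test_20_rows instead, so use whichever repository you have access to. The dataset should have the following structure:

  • Column: title (the text to classify)
  • Column: label (the integer class ID)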

Example entries:

  • "Jack Carr's take on the late Tom Clancy..." (label: 0)
  • "Feeding America CEO asks community to help..." (label: 0)
  • "Trump's campaign rival decides between..." (label: 0)
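Before running the pipeline, it can help to confirm that your rows match this structure. The following is a minimal sketch in plain Python, assuming each row is available as a dictionary; validate_rows is a hypothetical helper and is not part of the notebook.

```python
def validate_rows(rows):
    """Return a list of problems found in rows; an empty list means the rows look fine."""
    problems = []
    for i, row in enumerate(rows):
        # The pipeline tokenizes the "title" column, so it must be text.
        if not isinstance(row.get("title"), str):
            problems.append(f"row {i}: 'title' missing or not a string")
        # The accuracy metric compares predictions against integer labels.
        if not isinstance(row.get("label"), int):
            problems.append(f"row {i}: 'label' missing or not an integer")
    return problems


# Two of the example entries listed above, written as dictionaries.
sample_rows = [
    {"title": "Feeding America CEO asks community to help...", "label": 0},
    {"title": "Trump's campaign rival decides between...", "label": 0},
]
print(validate_rows(sample_rows))  # prints []
```

An empty list means every row has both columns with the expected types; otherwise each entry names the offending row.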

Model Information

The model being evaluated is hosted on the Hugging Face Hub as CIS5190ml/bert4.

Evaluation Pipeline

The complete evaluation pipeline is provided in the file: Evaluation_Pipeline.ipynb

This Jupyter Notebook walks you through the following steps:

  1. Loading the pre-trained model and tokenizer
  2. Loading and preprocessing the test dataset
  3. Running predictions on the test data
  4. Computing the evaluation metric (e.g., accuracy)
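Steps 3 and 4 reduce to picking the highest-scoring class for each example and counting how often that choice matches the label. The sketch below shows the idea in plain Python with made-up scores; the function names and toy values are illustrative assumptions, and the notebook itself relies on torch and the evaluate library for this.

```python
def argmax(scores):
    """Index of the largest score (the first one wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def accuracy(logits, labels):
    """Fraction of rows whose highest-scoring class matches the label."""
    preds = [argmax(row) for row in logits]
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)


# Made-up logits for four examples and two classes (illustrative only).
toy_logits = [[2.1, -0.3], [0.2, 1.5], [1.0, 0.9], [-0.5, 0.4]]
toy_labels = [0, 1, 1, 1]
print(accuracy(toy_logits, toy_labels))  # prints 0.75
```

Here three of the four predictions match their labels, which gives 0.75.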

Quick Start

Clone this repository and navigate to the directory:

git clone <repository-url>
cd <repository-directory>

Open the Jupyter Notebook:

jupyter notebook Evaluation_Pipeline.ipynb

Follow the step-by-step instructions in the notebook to evaluate the model.

Code Example

Here is an overview of the evaluation pipeline used in the notebook:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
from datasets import load_dataset
import evaluate
import torch
from torch.utils.data import DataLoader

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("CIS5190ml/bert4")
model = AutoModelForSequenceClassification.from_pretrained("CIS5190ml/bert4")

# Load dataset
ds = load_dataset("CIS5190ml/test_20_rows", split="train")

# Preprocessing
def preprocess_function(examples):
    return tokenizer(examples["title"], truncation=True, padding="max_length")

encoded_ds = ds.map(preprocess_function, batched=True)
encoded_ds = encoded_ds.remove_columns([col for col in encoded_ds.column_names if col not in ["input_ids", "attention_mask", "label"]])
encoded_ds.set_format("torch")

# Create DataLoader
test_loader = DataLoader(encoded_ds, batch_size=8)

# Evaluate
accuracy = evaluate.load("accuracy")
model.eval()

for batch in test_loader:
    with torch.no_grad():
        outputs = model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"])
        preds = torch.argmax(outputs.logits, dim=-1)
        accuracy.add_batch(predictions=preds, references=batch["label"])

final_accuracy = accuracy.compute()
print("Accuracy:", final_accuracy["accuracy"])

Output

After running the pipeline, the evaluation metric (e.g., accuracy) will be displayed in the notebook output. Example:

Accuracy: 0.85

Notes

  • If your dataset or column names differ, update the load_dataset() call and the column names used in preprocess_function and the remove_columns() filter in the notebook.
  • To use a different evaluation metric, change the metric name passed to evaluate.load() in the notebook.
  • For any issues or questions, please feel free to reach out.
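
If you switch to another metric (for example by passing "f1" to evaluate.load()), a small reference implementation is useful for sanity-checking the reported number. The sketch below computes F1 for one class in plain Python. It assumes binary labels 0 and 1 with 1 as the positive class, which the source does not state; binary_f1 is a hypothetical helper, not part of the notebook.

```python
def binary_f1(preds, labels, positive=1):
    """F1 score for one class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(preds, labels))
    tp = sum(p == positive and y == positive for p, y in pairs)
    fp = sum(p == positive and y != positive for p, y in pairs)
    fn = sum(p != positive and y == positive for p, y in pairs)
    if tp == 0:
        # No true positives: define F1 as 0 and avoid dividing by zero.
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


# Made-up predictions and labels (illustrative only).
toy_preds = [1, 0, 1, 1]
toy_labels = [1, 0, 0, 1]
print(binary_f1(toy_preds, toy_labels))  # prints 0.8
```

Unlike accuracy, F1 ignores true negatives, so it is more informative when one class dominates the test set.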