---
language:
- en
metrics:
- accuracy
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
library_name: transformers
---
> **Note:** Please check Ed for the access token.
# Model Evaluation Guide
This document explains how to evaluate a pre-trained sequence classification model on a test dataset.
## Prerequisites
Before running the evaluation pipeline, ensure you have the following installed:
- Python 3.7+
- Required Python libraries
Install them by running:
```bash
pip install transformers datasets evaluate torch
```
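You can confirm the installation with a quick version check (a sketch; any reasonably recent versions of these libraries should work):
```bash
python -c "import transformers, datasets, evaluate, torch; print(transformers.__version__, datasets.__version__, evaluate.__version__, torch.__version__)"
```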
## Dataset Information
The test dataset is hosted on the Hugging Face Hub under the namespace `CIS5190ml/Dataset` (the notebook's code example below loads it as `CIS5190ml/test_20_rows`). The dataset should have the following columns:
- `title`: the input text to classify
- `label`: the integer class label
Example entries:
- "Jack Carr's take on the late Tom Clancy..." (label: 0)
- "Feeding America CEO asks community to help..." (label: 0)
- "Trump's campaign rival decides between..." (label: 0)
## Model Information
The model being evaluated is hosted under the Hugging Face Hub namespace `CIS5190ml/bert4`.
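If the model repository is gated, recent versions of `transformers` accept a `token` argument in `from_pretrained` (older versions use `use_auth_token`). This is a sketch; the placeholder token below is hypothetical, use the one shared on Ed:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

hf_token = "hf_..."  # hypothetical placeholder; use the token from Ed
tokenizer = AutoTokenizer.from_pretrained("CIS5190ml/bert4", token=hf_token)
model = AutoModelForSequenceClassification.from_pretrained("CIS5190ml/bert4", token=hf_token)
```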
## Evaluation Pipeline
The complete evaluation pipeline is provided in the file:
**Evaluation_Pipeline.ipynb**
This Jupyter Notebook walks you through the following steps:
1. Loading the pre-trained model and tokenizer
2. Loading and preprocessing the test dataset
3. Running predictions on the test data
4. Computing the evaluation metric (e.g., accuracy)
## Quick Start
Clone this repository and navigate to the directory:
```bash
git clone <repository-url>
cd <repository-directory>
```
Open the Jupyter Notebook:
```bash
jupyter notebook Evaluation_Pipeline.ipynb
```
Follow the step-by-step instructions in the notebook to evaluate the model.
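If you prefer to run the notebook non-interactively, `jupyter nbconvert` can execute it end to end (assuming Jupyter is installed as above; the output filename is arbitrary):
```bash
jupyter nbconvert --to notebook --execute Evaluation_Pipeline.ipynb --output Evaluation_Pipeline_executed.ipynb
```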
## Code Example
Here is an overview of the evaluation pipeline used in the notebook:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from datasets import load_dataset
import evaluate
import torch
from torch.utils.data import DataLoader

# Load the fine-tuned model and its tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained("CIS5190ml/bert4")
model = AutoModelForSequenceClassification.from_pretrained("CIS5190ml/bert4")

# Load the test dataset
ds = load_dataset("CIS5190ml/test_20_rows", split="train")

# Tokenize the `title` column, padding/truncating to the model's max length
def preprocess_function(examples):
    return tokenizer(examples["title"], truncation=True, padding="max_length")

encoded_ds = ds.map(preprocess_function, batched=True)

# Keep only the tensors the model expects, plus the labels
encoded_ds = encoded_ds.remove_columns(
    [col for col in encoded_ds.column_names if col not in ["input_ids", "attention_mask", "label"]]
)
encoded_ds.set_format("torch")

# Batch the encoded dataset
test_loader = DataLoader(encoded_ds, batch_size=8)

# Accumulate predictions batch by batch, then compute accuracy
accuracy = evaluate.load("accuracy")
model.eval()
for batch in test_loader:
    with torch.no_grad():
        outputs = model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"])
    preds = torch.argmax(outputs.logits, dim=-1)
    accuracy.add_batch(predictions=preds, references=batch["label"])

final_accuracy = accuracy.compute()
print("Accuracy:", final_accuracy["accuracy"])
```
## Output
After running the pipeline, the evaluation metric (e.g., accuracy) will be displayed in the notebook output. Example:
```
Accuracy: 0.85
```
## Notes
* If your dataset or column names differ, update the relevant sections in the notebook.
* To use a different evaluation metric, change the metric id passed to `evaluate.load()` in the notebook (see the sketch below).
* For any issues or questions, please feel free to reach out. |
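For example, switching from accuracy to F1 only changes the metric id (a sketch with toy predictions; in the notebook the values come from the DataLoader loop, and multi-class labels would additionally need an `average` argument):
```python
import evaluate

# Binary F1 on toy predictions/references
f1 = evaluate.load("f1")
result = f1.compute(predictions=[0, 1, 1, 0], references=[0, 1, 0, 0])
print("F1:", result["f1"])
```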