---
language:
- en
metrics:
- accuracy
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
library_name: transformers
---

# PLEASE CHECK ED FOR TOKEN

# Model Evaluation Guide

This document provides instructions for evaluating a pre-trained sequence classification model on a test dataset.

## Prerequisites

Before running the evaluation pipeline, ensure you have the following installed:

- Python 3.7+
- Required Python libraries

Install the libraries by running:

```bash
pip install transformers datasets evaluate torch
```

## Dataset Information

The test dataset is hosted on the Hugging Face Hub as `CIS5190ml/Dataset`. The dataset has the following structure:

- Column: `title`
- Column: `label`

Example entries:

- "Jack Carr's take on the late Tom Clancy..." (label: 0)
- "Feeding America CEO asks community to help..." (label: 0)
- "Trump's campaign rival decides between..." (label: 0)

## Model Information

The model being evaluated is hosted on the Hugging Face Hub as `CIS5190ml/bert3`.
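
If you want to confirm how the model's output indices map to class names before evaluating, you can inspect its config. This is an optional sketch; the `id2label` entries may just be the default `LABEL_0`/`LABEL_1` names if custom labels were not saved with the model.

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("CIS5190ml/bert3")

# Number of output classes and the index-to-name mapping stored in the config.
print(model.config.num_labels)
print(model.config.id2label)  # e.g. {0: 'LABEL_0', 1: 'LABEL_1'} if no custom names were saved
```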

## Evaluation Pipeline

The complete evaluation pipeline is provided in the file **Evaluation_Pipeline.ipynb**.

This Jupyter Notebook walks you through the following steps:

1. Loading the pre-trained model and tokenizer
2. Loading and preprocessing the test dataset
3. Running predictions on the test data
4. Computing the evaluation metric (e.g., accuracy)

## Quick Start

Clone this repository and navigate to the directory:

```bash
git clone <repository-url>
cd <repository-directory>
```

Open the Jupyter Notebook:

```bash
jupyter notebook Evaluation_Pipeline.ipynb
```

Follow the step-by-step instructions in the notebook to evaluate the model.

## Code Example

Here is an overview of the evaluation pipeline used in the notebook:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from datasets import load_dataset
import evaluate
import torch
from torch.utils.data import DataLoader

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("CIS5190ml/bert3")
model = AutoModelForSequenceClassification.from_pretrained("CIS5190ml/bert3")

# Load dataset
ds = load_dataset("CIS5190ml/test_20_rows", split="train")

# Preprocessing
def preprocess_function(examples):
    return tokenizer(examples["title"], truncation=True, padding="max_length")

encoded_ds = ds.map(preprocess_function, batched=True)
encoded_ds = encoded_ds.remove_columns([col for col in encoded_ds.column_names if col not in ["input_ids", "attention_mask", "label"]])
encoded_ds.set_format("torch")

# Create DataLoader
test_loader = DataLoader(encoded_ds, batch_size=8)

# Evaluate
accuracy = evaluate.load("accuracy")
model.eval()

for batch in test_loader:
    with torch.no_grad():
        outputs = model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"])
    preds = torch.argmax(outputs.logits, dim=-1)
    accuracy.add_batch(predictions=preds, references=batch["label"])

final_accuracy = accuracy.compute()
print("Accuracy:", final_accuracy["accuracy"])
```
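
The snippet above runs on the CPU. If a GPU is available, a small variation of the same loop (a sketch, not part of the original notebook) moves the model and each batch to the GPU first:

```python
# Reuses model, test_loader, and the accuracy metric defined above.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

for batch in test_loader:
    # Move the batch tensors to the same device as the model.
    input_ids = batch["input_ids"].to(device)
    attention_mask = batch["attention_mask"].to(device)
    with torch.no_grad():
        outputs = model(input_ids=input_ids, attention_mask=attention_mask)
    preds = torch.argmax(outputs.logits, dim=-1)
    # The metric accumulates on the CPU, so move predictions back.
    accuracy.add_batch(predictions=preds.cpu(), references=batch["label"])

print("Accuracy:", accuracy.compute()["accuracy"])
```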

## Output

After running the pipeline, the evaluation metric (e.g., accuracy) will be displayed in the notebook output. Example:

```
Accuracy: 0.85
```

## Notes

* If your dataset or column names differ, update the relevant sections in the notebook.
* To use a different evaluation metric, change the metric name passed to `evaluate.load()` in the notebook (see the sketch below).
* For any issues or questions, please feel free to reach out.
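
For example, here is a minimal sketch of reporting F1 alongside accuracy, assuming binary labels like those shown above and reusing `model` and `test_loader` from the code example:

```python
import torch
import evaluate

# evaluate.combine bundles several metrics behind one add_batch/compute interface.
metrics = evaluate.combine(["accuracy", "f1"])

model.eval()
for batch in test_loader:
    with torch.no_grad():
        outputs = model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"])
    preds = torch.argmax(outputs.logits, dim=-1)
    metrics.add_batch(predictions=preds, references=batch["label"])

print(metrics.compute())  # e.g. {'accuracy': ..., 'f1': ...}
```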