---
license: mit
datasets:
  - mikemcrae25/black_entity_classifier
language:
  - en
metrics:
  - accuracy
base_model:
  - google-bert/bert-base-uncased
pipeline_tag: text-classification
---

🧠 BERT Classifier for Black Article Detection

📝 Model Overview

This repository hosts a fine-tuned BERT model (bert-base-uncased) that classifies sentences from newspaper articles by whether they focus on Black people. The training dataset is also provided for reproducibility.

📖 Description

Model: Fine-tuned bert-base-uncased
Training Data: 2,000 manually labeled sentences from historical newspaper articles (1960–1973)
Inputs: sentence (string)
Outputs: black_story (1 if the sentence focuses on Black people, 0 otherwise)
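
The input/output contract above can be made concrete with a small helper that converts a `text-classification` pipeline prediction into the `black_story` value. This is a minimal sketch, not the author's code: it assumes the model exposes the default Hugging Face label names `LABEL_0` and `LABEL_1`, which this card does not confirm, and the helper name `to_black_story` is hypothetical.

```python
# Assumed label names: the Hugging Face default when a model config
# does not define custom id2label entries. Adjust if the model differs.
LABEL_TO_BLACK_STORY = {"LABEL_0": 0, "LABEL_1": 1}


def to_black_story(prediction):
    """Map one pipeline prediction dict to the black_story value (0 or 1)."""
    label = prediction["label"]
    if label not in LABEL_TO_BLACK_STORY:
        raise ValueError(f"unexpected label name: {label!r}")
    return LABEL_TO_BLACK_STORY[label]


# Hypothetical pipeline output for a single sentence (the score is made up).
example_output = [{"label": "LABEL_1", "score": 0.97}]
print([to_black_story(p) for p in example_output])  # [1]
```

Raising on an unknown label name, rather than defaulting to 0, keeps a label-naming mismatch from silently turning into negative predictions.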

📊 Performance Metrics

Training Accuracy: 93.5%
Validation Accuracy: 91.2%
Precision: 90.8%
Recall: 92.1%
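
For readers who want to check figures like these against their own predictions, the sketch below shows how accuracy, precision, and recall follow from binary labels. It is an illustration only: the function name and the toy labels are hypothetical, and it does not reproduce the evaluation behind the numbers reported above.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall for binary labels (0 or 1)."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Precision: share of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: share of true positives that the model found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels for illustration only, not drawn from the dataset.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]
print(classification_metrics(y_true, y_pred))
# {'accuracy': 0.625, 'precision': 0.6, 'recall': 0.75}
```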

🚀 Usage Instructions

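```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline("text-classification", model="mikemcrae/black-article-classifier")
result = classifier("Black activists led a peaceful protest downtown.")
print(result)  # a list with one dict holding "label" and "score"
```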

💾 Training Dataset

Hugging Face Dataset: mikemcrae/black-article-training-data
CSV Columns: sentence, black_story

📊 Example Data Preview

sentence,black_story
"The Black Panthers organized a march for civil rights.",1
"The mayor discussed the city's budget for next year.",0
"Black students protested against segregation policies.",1
"Black car for sale.",0
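
The preview rows can be parsed and sanity-checked with the standard library before any training. The sketch below assumes the full CSV follows the same two-column layout as the preview; the helper name `read_rows` is hypothetical.

```python
import csv
import io

# The preview rows from this card, embedded so the example runs offline.
PREVIEW = '''sentence,black_story
"The Black Panthers organized a march for civil rights.",1
"The mayor discussed the city's budget for next year.",0
"Black students protested against segregation policies.",1
"Black car for sale.",0
'''


def read_rows(text):
    """Parse CSV text into (sentence, label) pairs, validating each label."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        label = int(row["black_story"])
        if label not in (0, 1):
            raise ValueError(f"label must be 0 or 1, got {label}")
        rows.append((row["sentence"], label))
    return rows


rows = read_rows(PREVIEW)
# Number of rows, then number of positive (black_story = 1) rows.
print(len(rows), sum(label for _, label in rows))  # 4 2
```

The last preview row shows why a keyword match on "Black" is not sufficient: "Black car for sale." is labeled 0.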

## ⚙️ Reproduction Instructions
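```python
from datasets import load_dataset
from transformers import BertForSequenceClassification, Trainer, TrainingArguments

# Load the labeled sentences (columns: sentence, black_story).
dataset = load_dataset("mikemcrae/black-article-training-data")

# Start from bert-base-uncased with a fresh two-class classification head.
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
```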

📜 License

MIT License: free to use, modify, and redistribute, provided the copyright and license notice is retained.

MIT License © 2025 Mike McRae
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND.

❤️ Citation

@misc{mcrae2025blackbert,
  title={BERT Classifier for Black Article Detection},
  author={Mike McRae},
  year={2025},
  url={https://huggingface.co/mikemcrae/black-article-classifier}
}