---
license: mit
datasets:
- mikemcrae25/black_entity_classifier
language:
- en
metrics:
- accuracy
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
---
# BERT Classifier for Black Article Detection
## Model Overview
This repository hosts a fine-tuned BERT model (`bert-base-uncased`) for classifying newspaper articles based on their focus on Black people. The training dataset is also provided for reproducibility.
## Description
- Model: Fine-tuned `bert-base-uncased`
- Training Data: 2,000 manually labeled sentences from historical newspaper articles (1960–1973)
- Inputs: `sentence` (string)
- Outputs: `black_story` (0 or 1)
## Performance Metrics
- Training Accuracy: 93.5%
- Validation Accuracy: 91.2%
- Precision: 90.8%
- Recall: 92.1%
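
As a reminder of how these figures are defined, the following minimal sketch computes accuracy, precision, and recall from binary labels. The helper name `binary_metrics` and the toy labels are invented for illustration; they are not the model's evaluation data and do not reproduce the numbers above.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels, invented for illustration only (1 = black_story, 0 = other).
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Precision is the share of predicted positives that are correct, while recall is the share of true positives that the classifier finds.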
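## Usage Instructions
```python
from transformers import pipeline
# Load the fine-tuned classifier from the Hugging Face Hub
classifier = pipeline("text-classification", model="mikemcrae/black-article-classifier")
# Classify a single sentence; the result is a list of dicts with "label" and "score" keys
result = classifier("Black activists led a peaceful protest downtown.")
print(result)
```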
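
The pipeline returns a list of dictionaries with `label` and `score` keys. Unless the model's config defines custom label names, Transformers falls back to `LABEL_0` and `LABEL_1`. The sketch below assumes that default and maps it to the `black_story` values; the helper name `to_black_story` and the sample output are hypothetical.

```python
def to_black_story(prediction):
    """Map one pipeline prediction to the 0/1 black_story value.

    Assumes default label names of the form "LABEL_<id>"; adjust this if
    the model config defines its own id2label mapping.
    """
    label = prediction["label"]
    if not label.startswith("LABEL_"):
        raise ValueError(f"Unexpected label name: {label!r}")
    return int(label.split("_", 1)[1])


# Hypothetical output, shaped like the list that pipeline() returns.
sample_result = [{"label": "LABEL_1", "score": 0.98}]
flags = [to_black_story(p) for p in sample_result]
print(flags)
```

If the published model uses different label names, inspect `classifier.model.config.id2label` and adapt the mapping accordingly.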
## Training Dataset
- Hugging Face Dataset: mikemcrae/black-article-training-data
- CSV Columns: `sentence`, `black_story`
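
A CSV file with these two columns can be read with Python's standard `csv` module. The sketch below parses a few rows from an in-memory string (copied from this card's example data preview) and splits them into sentences and integer labels. The helper name `read_examples` is hypothetical; for a real file, pass an open file handle instead of `io.StringIO`.

```python
import csv
import io

# Sample rows copied from the example data preview in this card.
PREVIEW = '''sentence,black_story
"The Black Panthers organized a march for civil rights.",1
"The mayor discussed the city's budget for next year.",0
"Black students protested against segregation policies.",1
"Black car for sale.",0
'''


def read_examples(handle):
    """Return (sentences, labels) from a CSV with sentence and black_story columns."""
    sentences, labels = [], []
    for row in csv.DictReader(handle):
        sentences.append(row["sentence"])
        labels.append(int(row["black_story"]))
    return sentences, labels


sentences, labels = read_examples(io.StringIO(PREVIEW))
print(labels)
```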
## Example Data Preview
```csv
sentence,black_story
"The Black Panthers organized a march for civil rights.",1
"The mayor discussed the city's budget for next year.",0
"Black students protested against segregation policies.",1
"Black car for sale.",0
```
## Reproduction Instructions
```python
from datasets import load_dataset
from transformers import BertForSequenceClassification, Trainer, TrainingArguments
# Load the labeled training sentences from the Hugging Face Hub
dataset = load_dataset("mikemcrae/black-article-training-data")
# Start from the pretrained base model with a two-class classification head
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
```
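
One way to continue from the snippet above is sketched here: tokenize the `sentence` column, rename `black_story` to `labels`, and fine-tune with `Trainer`. The function names `compute_metrics` and `train`, the `train` split name, the 80/20 split, `max_length=128`, three epochs, batch size 16, and the `output_dir` value are all assumptions for illustration, not the settings used to train the published model.

```python
def compute_metrics(eval_pred):
    """Accuracy from (logits, labels); works with lists or NumPy arrays."""
    logits, labels = eval_pred
    # Pick the index of the largest logit in each row as the predicted class.
    preds = [max(range(len(row)), key=lambda i: row[i]) for row in logits]
    correct = sum(int(p == int(l)) for p, l in zip(preds, labels))
    return {"accuracy": correct / len(preds)}


def train(dataset_id="mikemcrae/black-article-training-data",
          output_dir="bert-black-story"):  # output_dir is a hypothetical path
    # Imported lazily so the metric helper can be used without these packages.
    from datasets import load_dataset
    from transformers import (BertForSequenceClassification, BertTokenizerFast,
                              Trainer, TrainingArguments)

    # Assumption: the dataset exposes a single "train" split.
    dataset = load_dataset(dataset_id)
    splits = dataset["train"].train_test_split(test_size=0.2, seed=42)

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

    def tokenize(batch):
        # Fixed-length padding keeps the default data collator sufficient.
        return tokenizer(batch["sentence"], truncation=True,
                         padding="max_length", max_length=128)

    splits = splits.map(tokenize, batched=True)
    splits = splits.rename_column("black_story", "labels")

    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)
    # Assumed hyperparameters; tune them for a real run.
    args = TrainingArguments(output_dir=output_dir, num_train_epochs=3,
                             per_device_train_batch_size=16)
    trainer = Trainer(model=model, args=args,
                      train_dataset=splits["train"],
                      eval_dataset=splits["test"],
                      compute_metrics=compute_metrics)
    trainer.train()
    return trainer.evaluate()
```

Calling `train()` downloads the dataset and the base model, so it needs network access and benefits from a GPU. The returned dictionary contains the evaluation loss and the accuracy on the held-out split.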
## License
MIT License: Free to use with attribution.
MIT License © 2025 Mike McRae
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND.
## Citation
```bibtex
@inproceedings{mcrae2025blackbert,
  title={BERT Classifier for Black Article Detection},
  author={Mike McRae},
  year={2025},
  url={https://huggingface.co/mikemcrae/black-article-classifier}
}
```