Fine-Tuned BART Model for Text Classification on CNN News Articles

This is a BART (Bidirectional and Auto-Regressive Transformers) model fine-tuned for text classification on CNN news articles. It was trained on a dataset of CNN news articles labeled by topic, for one epoch, with a batch size of 32 and a learning rate of 6e-5.
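As a rough sketch, the hyperparameters above could be collected as follows. The key names match arguments of the `TrainingArguments` class in `transformers`, but this card does not say which training loop was used, so treating them as `Trainer` settings is an assumption, as are the single-device setup and the output directory name.

```python
# Hyperparameters stated in the description above.
training_kwargs = {
    "per_device_train_batch_size": 32,  # equals the stated batch size only on a single device (assumption)
    "learning_rate": 6e-5,
    "num_train_epochs": 1,
    "output_dir": "bart-cnn-news-classifier",  # hypothetical path, not from the card
}

# With transformers installed, these could be passed on as:
#   from transformers import TrainingArguments
#   args = TrainingArguments(**training_kwargs)
for name, value in training_kwargs.items():
    print(f"{name} = {value}")
```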

How to Use

Install

pip install transformers torch

Example Usage

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("IT-community/BART_cnn_news_text_classification")
model = AutoModelForSequenceClassification.from_pretrained("IT-community/BART_cnn_news_text_classification")

# Tokenize input text
text = "This is an example CNN news article about politics."
inputs = tokenizer(text, padding=True, truncation=True, max_length=512, return_tensors="pt")

# Make prediction (no gradients are needed for inference)
with torch.no_grad():
    outputs = model(inputs["input_ids"], attention_mask=inputs["attention_mask"])

# Index of the highest-scoring class, as a plain Python int
predicted_label = torch.argmax(outputs.logits, dim=-1).item()

print(predicted_label)
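The example above prints a class index. To report a topic name with a confidence score instead, the logits can be passed through a softmax and looked up in a label mapping. The sketch below does this in plain Python so it runs without downloading the model; the label names and logit values are hypothetical, and `top_prediction` is an illustrative helper, not part of `transformers`.

```python
import math

def top_prediction(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    # Subtract the largest logit before exponentiating, for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the most probable class.
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]

# Hypothetical label names and logits, for illustration only.
example_id2label = {0: "politics", 1: "sport", 2: "business"}
example_logits = [2.0, 0.5, -1.0]

label, prob = top_prediction(example_logits, example_id2label)
print(f"{label} ({prob:.2f})")
```

With the real model, the inputs would be `outputs.logits[0].tolist()` and `model.config.id2label`, which holds whatever label names were saved with the checkpoint.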

Evaluation

The model achieved the following performance metrics on the test set:

Accuracy: 0.9591836734693877

F1-score: 0.958301875401112

Recall: 0.9591836734693877

Precision: 0.9579673040369542
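The card does not state how the multi-class scores were averaged. The reported accuracy and recall are identical, which is what support-weighted averaging produces, since weighted recall always equals accuracy. Under that assumption, the four metrics could be computed as in this minimal sketch; the labels are toy data, not the model's test set, and `weighted_metrics` is an illustrative helper.

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall and F1."""
    n = len(y_true)
    support = Counter(y_true)      # true examples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n         # each class weighted by its share of the data
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy labels for illustration only.
toy_true = ["politics", "politics", "sport", "sport", "business"]
toy_pred = ["politics", "sport", "sport", "sport", "business"]
scores = weighted_metrics(toy_true, toy_pred)
print(scores)
```

If scikit-learn is available, `precision_recall_fscore_support(y_true, y_pred, average="weighted")` together with `accuracy_score` gives the same quantities.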

About Us

We are IT Community, a scientific club at Saad Dahleb Blida University, created by students in 2016. We are interested in all IT fields. This work was done by the IT Community club.

Contributions

Cherguelaine Ayoub:

Added preprocessing code for CNN news articles
Improved model performance with additional fine-tuning on a larger dataset
