Hugging Face Model: BART-MNLI-ZeroShot-Text-Classification

This is a Hugging Face model, fine-tuned from BART-MNLI on the CNN news dataset for zero-shot text classification. The model achieved an F1 score of 94% and an accuracy of 94% on the CNN test dataset with a maximum length of 128 tokens.

Authors

This work was done by CHERGUELAINE Ayoub and BOUBEKRI Faycal.

Original Model

facebook/bart-large-mnli

Model Architecture

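The BART-Large-MNLI model is an encoder-decoder transformer with 12 encoder layers and 12 decoder layers, a hidden size of 1024, and about 406 million parameters. The underlying BART model is pre-trained with a denoising objective on a large English corpus that includes Wikipedia and BookCorpus, and the checkpoint is then fine-tuned on the Multi-Genre Natural Language Inference (MNLI) task.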

Dataset

The CNN news dataset was used for fine-tuning the model. This dataset contains news articles from the CNN website, each labeled with a topic category. The categories include politics, health, entertainment, tech, travel, world, and sports.

Fine-tuning Parameters

The model was fine-tuned for 1 epoch with a maximum sequence length of 256 tokens. Training took approximately 6 hours.
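The card does not describe how labeled articles were turned into training examples for an NLI model. One common approach, shown here purely as an assumption and not as the authors' method, is to pair each article with a hypothesis built from its true category (entailment) and from a randomly chosen other category (contradiction). The function name `make_nli_pairs` and the hypothesis template are illustrative. The template matches the default used by the Transformers zero-shot pipeline.

```python
import random
from typing import Dict, List, Sequence

# Assumption: same wording as the zero-shot pipeline's default hypothesis template.
HYPOTHESIS_TEMPLATE = "This example is {}."


def make_nli_pairs(
    text: str,
    true_label: str,
    all_labels: Sequence[str],
    rng: random.Random,
) -> List[Dict[str, str]]:
    """Build one entailment and one contradiction example for a labeled article."""
    other_labels = [label for label in all_labels if label != true_label]
    negative_label = rng.choice(other_labels)
    return [
        {
            "premise": text,
            "hypothesis": HYPOTHESIS_TEMPLATE.format(true_label),
            "label": "entailment",
        },
        {
            "premise": text,
            "hypothesis": HYPOTHESIS_TEMPLATE.format(negative_label),
            "label": "contradiction",
        },
    ]
```

Each resulting premise and hypothesis pair would then be tokenized together and truncated to the maximum sequence length before training.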

Evaluation Metrics

The model achieved an F1 score of 94% and an accuracy of 94% on the CNN test dataset, evaluated with a maximum length of 128 tokens.
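For readers who want to reproduce these two metrics on their own predictions, here is a minimal sketch in plain Python. The card does not state how F1 was averaged across categories, so macro averaging is an assumption, and the function names are illustrative.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class: List[float] = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)
```

Both functions take the reference labels and the predicted top labels as equal-length sequences of category names.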

Usage

The model can be used for zero-shot text classification of news articles. It can be loaded with the Hugging Face Transformers library using the following code:

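import torch
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("AyoubChLin/Bart-MNLI-CNN_news")
model = AutoModelForSequenceClassification.from_pretrained("AyoubChLin/Bart-MNLI-CNN_news")

# Use the first GPU when one is available; device=-1 runs the pipeline on CPU.
classifier = pipeline(
    "zero-shot-classification",
    model=model,
    tokenizer=tokenizer,
    device=0 if torch.cuda.is_available() else -1,
)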
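Once a zero-shot classifier is built, it is called with a text and a list of candidate labels. The sketch below shows one way to wrap that call. The helper names `build_classifier` and `top_label` are illustrative, and the default candidate labels are taken from the categories named in the Dataset section, so replace them with whatever labels fit your data.

```python
from typing import Callable, Sequence, Tuple

# Assumption: candidate labels copied from the categories listed in the Dataset section.
CANDIDATE_LABELS = [
    "politics", "health", "entertainment", "tech", "travel", "world", "sports",
]


def build_classifier():
    """Create the zero-shot pipeline; downloads the checkpoint on first use."""
    from transformers import pipeline  # imported lazily to keep the helper light

    return pipeline(
        "zero-shot-classification",
        model="AyoubChLin/Bart-MNLI-CNN_news",
    )


def top_label(
    classifier: Callable,
    text: str,
    labels: Sequence[str] = CANDIDATE_LABELS,
) -> Tuple[str, float]:
    """Return the highest-scoring label and its score for one text."""
    result = classifier(text, candidate_labels=list(labels))
    # The pipeline returns "labels" and "scores" sorted by descending score.
    return result["labels"][0], result["scores"][0]
```

A typical call would be `top_label(build_classifier(), article_text)`, where `article_text` is a string holding the article. Long articles are worth shortening first, since the model was trained and evaluated on truncated inputs.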

Acknowledgments

We would like to thank the Hugging Face team for their open-source implementation of transformer models, and the providers of the CNN news dataset for the labeled data used for fine-tuning.
