Model Card for Fine-Tuned Sentiment Analysis Model on IMDB Dataset
This model is a fine-tuned version of a pre-trained transformer model, specifically designed for sentiment analysis on the IMDB dataset. It classifies movie reviews as either "positive" or "negative" based on the sentiment expressed.
Model Details
Model Description
This is the model card of a 🤗 Transformers model that has been fine-tuned on the IMDB dataset, which consists of 50,000 movie reviews. The model performs binary sentiment classification, identifying whether a given review expresses a positive or negative sentiment.
- Developed by: Melek Sahlia
- Funded by [optional]: Self-funded / [Optional sponsor name]
- Shared by [optional]: Melek
- Model type: SequenceClassification
- Language(s) (NLP): English
- License: [More Information Needed]
- Finetuned from model [optional]: distilbert-base-uncased
Model Sources [optional]
- Repository: [Link to the Hugging Face repository]
- Paper [optional]: [Optional if there's a related paper]
- Demo [optional]: [Link to a demo, if available]
Direct Use
This model can be directly used for sentiment analysis tasks, particularly for classifying movie reviews or similar texts as "positive" or "negative."
How to Get Started with the Model
Use the code below to get started with the model.
```python
from transformers import pipeline

# Load the fine-tuned model through the text-classification pipeline
nlp = pipeline("text-classification", model="MALEKSAHLIA/fine-tuned-sentiment-model-imdb")

# Example usage
sentence = "This movie was fantastic!"
result = nlp(sentence)

def evaluate_sentence(sentence, result):
    # LABEL_1 corresponds to positive sentiment in this model's label mapping
    sentiment = "Positive" if result[0]["label"] == "LABEL_1" else "Negative"
    print(f"'{sentence}' is classified as {sentiment}")

evaluate_sentence(sentence, result)
```
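Each call to the pipeline returns a list with one dictionary per input, containing a `label` and a confidence `score`, e.g. `[{'label': 'LABEL_1', 'score': 0.98}]` (the score shown here is illustrative).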
Training Details
Training Data
The model was fine-tuned on the IMDB dataset, which contains 50,000 movie reviews labeled as either positive or negative.
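For reference, the dataset can be loaded with the 🤗 Datasets library; this is a minimal sketch, as the exact split handling used during fine-tuning is not recorded in this card:

```python
from datasets import load_dataset

# IMDB ships with 25,000 labeled reviews each in the train and test splits
dataset = load_dataset("imdb")
print(dataset["train"][0])  # {'text': '...', 'label': 0 (negative) or 1 (positive)}
```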
Training Procedure
Preprocessing [optional]

Standard preprocessing steps included tokenization using the "distilbert-base-uncased" tokenizer.
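A minimal sketch of that tokenization step, reusing the `dataset` object from the loading sketch above and assuming the usual truncation-and-padding setup for DistilBERT (the exact arguments are not recorded in this card):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Truncate to DistilBERT's 512-token maximum; fixed-length padding is assumed
    return tokenizer(batch["text"], truncation=True, padding="max_length")

tokenized_dataset = dataset.map(tokenize, batched=True)
```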
Training Hyperparameters
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
)
```
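These arguments plug into the standard Trainer API roughly as follows; this is a sketch that assumes the `tokenized_dataset` from the preprocessing section and uses the test split for evaluation (the actual evaluation split is an assumption):

```python
from transformers import AutoModelForSequenceClassification, Trainer

# Initialize the base model with a two-class classification head
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],  # assumption: test split used for eval
)

trainer.train()
```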
Evaluation
Testing Data, Factors & Metrics

Testing Data
The model was evaluated on a 2,000-review subset of the IMDB test split.
Factors
Evaluation focused on the binary classification of sentiment in movie reviews.
Metrics
The following metrics were used to evaluate both the fine-tuned model and the non-fine-tuned base model:

| Metric          | Fine-tuned model | Non-fine-tuned model |
|-----------------|------------------|----------------------|
| Accuracy        | 94%              | 50%                  |
| Precision       | 93%              | 78%                  |
| Recall          | 94%              | 2%                   |
| F1-Score        | 94%              | 2%                   |
| Evaluation Loss | 0.29             | 0.67                 |

Results

These results show that the fine-tuned model significantly outperforms the pre-trained model in terms of accuracy, precision, recall, and F1-score.
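For reproducibility, metrics like these can be computed with the 🤗 Evaluate library; the sketch below assumes predictions and reference labels are available as integer sequences (the values shown are hypothetical):

```python
import evaluate

accuracy = evaluate.load("accuracy")
precision = evaluate.load("precision")
recall = evaluate.load("recall")
f1 = evaluate.load("f1")

# Hypothetical predictions and references for illustration
preds = [1, 0, 1, 1]
refs = [1, 0, 0, 1]

print(accuracy.compute(predictions=preds, references=refs))
print(precision.compute(predictions=preds, references=refs))
print(recall.compute(predictions=preds, references=refs))
print(f1.compute(predictions=preds, references=refs))
```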
Visualization

Here is a comparison of the fine-tuned model versus the pre-trained model on the metrics above:
Demo

The Gradio app below wraps the model in a simple interactive interface:

```python
import gradio as gr
from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer

# Load the fine-tuned model and tokenizer from the Hub
model = AutoModelForSequenceClassification.from_pretrained("MALEKSAHLIA/fine-tuned-sentiment-model-imdb")
tokenizer = AutoTokenizer.from_pretrained("MALEKSAHLIA/fine-tuned-sentiment-model-imdb")

# Create a pipeline for sentiment analysis
nlp = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

def predict_sentiment(sentence):
    result = nlp(sentence)
    # LABEL_1 corresponds to positive sentiment in this model's label mapping
    sentiment = "Positive" if result[0]["label"] == "LABEL_1" else "Negative"
    return sentiment

iface = gr.Interface(
    fn=predict_sentiment,
    inputs="text",
    outputs="text",
    title="Sentiment Analysis",
    description="Enter a sentence to get the sentiment (Positive or Negative).",
)

iface.launch()
```
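Running this script starts a local web server and prints the URL where the interface is served; passing `share=True` to `launch()` additionally creates a temporary public link.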