Model Card for Fine-Tuned Sentiment Analysis Model on IMDB Dataset
This model is a fine-tuned version of a pre-trained transformer model, specifically designed for sentiment analysis on the IMDB dataset. It classifies movie reviews as either "positive" or "negative" based on the sentiment expressed.
Model Details
Model Description
This is the model card of a 🤗 Transformers model that has been fine-tuned on the IMDB dataset, which consists of 50,000 movie reviews. The model performs binary sentiment classification, identifying whether a given review expresses a positive or negative sentiment.
- Developed by: Melek Sahlia
- Funded by [optional]: Self-funded / [Optional sponsor name]
- Shared by [optional]: Melek
- Model type: SequenceClassification
- Language(s) (NLP): English
- License: [More Information Needed]
- Finetuned from model [optional]: distilbert-base-uncased
Model Sources [optional]
- Repository: [Link to the Hugging Face repository]
- Paper [optional]: [Optional if there's a related paper]
- Demo [optional]: [Link to a demo, if available]
Direct Use
This model can be directly used for sentiment analysis tasks, particularly for classifying movie reviews or similar texts as "positive" or "negative."
How to Get Started with the Model
Use the code below to get started with the model.
```python
from transformers import pipeline

# Load the fine-tuned model through the text-classification pipeline
nlp = pipeline("text-classification", model="MALEKSAHLIA/fine-tuned-sentiment-model-imdb")

# Example usage
sentence = "This movie was fantastic!"
result = nlp(sentence)

def evaluate_sentence(sentence, result):
    # LABEL_1 corresponds to positive sentiment in this model's label mapping
    sentiment = "Positive" if result[0]["label"] == "LABEL_1" else "Negative"
    print(f"'{sentence}' is classified as {sentiment}")

evaluate_sentence(sentence, result)
```
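Each call to the pipeline returns a list with one dictionary per input, containing a `label` and a confidence `score`, e.g. `[{'label': 'LABEL_1', 'score': 0.98}]` (the score shown here is illustrative).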
Training Details
Training Data
The model was fine-tuned on the IMDB dataset, which contains 50,000 movie reviews labeled as either positive or negative.
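For reference, the dataset can be loaded with the 🤗 Datasets library; this is a minimal sketch, as the exact split handling used during fine-tuning is not recorded in this card:

```python
from datasets import load_dataset

# IMDB ships with 25,000 labeled reviews each in the train and test splits
dataset = load_dataset("imdb")
print(dataset["train"][0])  # {'text': '...', 'label': 0 (negative) or 1 (positive)}
```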
Training Procedure
Preprocessing [optional]

Standard preprocessing steps included tokenization using the "distilbert-base-uncased" tokenizer.
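A minimal sketch of that tokenization step, reusing the `dataset` object from the loading sketch above and assuming the usual truncation-and-padding setup for DistilBERT (the exact arguments are not recorded in this card):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Truncate to DistilBERT's 512-token maximum; fixed-length padding is assumed
    return tokenizer(batch["text"], truncation=True, padding="max_length")

tokenized_dataset = dataset.map(tokenize, batched=True)
```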
Training Hyperparameters
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
)
```
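These arguments plug into the standard Trainer API roughly as follows; this is a sketch that assumes the `tokenized_dataset` from the preprocessing section and uses the test split for evaluation (the actual evaluation split is an assumption):

```python
from transformers import AutoModelForSequenceClassification, Trainer

# Initialize the base model with a two-class classification head
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],  # assumption: test split used for eval
)

trainer.train()
```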
Evaluation
Testing Data, Factors & Metrics

Testing Data
The model was evaluated on a 2,000-review subset of the IMDB test split.
Factors
Evaluation focused on the binary classification of sentiment in movie reviews.
Metrics
The following metrics were used to evaluate both the fine-tuned model and the non-fine-tuned base model:

| Metric          | Fine-tuned model | Non-fine-tuned model |
|-----------------|------------------|----------------------|
| Accuracy        | 94%              | 50%                  |
| Precision       | 93%              | 78%                  |
| Recall          | 94%              | 2%                   |
| F1-Score        | 94%              | 2%                   |
| Evaluation Loss | 0.29             | 0.67                 |

Results

These results show that the fine-tuned model significantly outperforms the pre-trained model in terms of accuracy, precision, recall, and F1-score.
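For reproducibility, metrics like these can be computed with the 🤗 Evaluate library; the sketch below assumes predictions and reference labels are available as integer sequences (the values shown are hypothetical):

```python
import evaluate

accuracy = evaluate.load("accuracy")
precision = evaluate.load("precision")
recall = evaluate.load("recall")
f1 = evaluate.load("f1")

# Hypothetical predictions and references for illustration
preds = [1, 0, 1, 1]
refs = [1, 0, 0, 1]

print(accuracy.compute(predictions=preds, references=refs))
print(precision.compute(predictions=preds, references=refs))
print(recall.compute(predictions=preds, references=refs))
print(f1.compute(predictions=preds, references=refs))
```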
Visualization

Here is a comparison of the fine-tuned model versus the pre-trained model on the metrics above:
Demo

The Gradio app below wraps the model in a simple interactive interface:

```python
import gradio as gr
from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer

# Load the fine-tuned model and tokenizer from the Hub
model = AutoModelForSequenceClassification.from_pretrained("MALEKSAHLIA/fine-tuned-sentiment-model-imdb")
tokenizer = AutoTokenizer.from_pretrained("MALEKSAHLIA/fine-tuned-sentiment-model-imdb")

# Create a pipeline for sentiment analysis
nlp = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

def predict_sentiment(sentence):
    result = nlp(sentence)
    # LABEL_1 corresponds to positive sentiment in this model's label mapping
    sentiment = "Positive" if result[0]["label"] == "LABEL_1" else "Negative"
    return sentiment

iface = gr.Interface(
    fn=predict_sentiment,
    inputs="text",
    outputs="text",
    title="Sentiment Analysis",
    description="Enter a sentence to get the sentiment (Positive or Negative).",
)

iface.launch()
```
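Running this script starts a local web server and prints the URL where the interface is served; passing `share=True` to `launch()` additionally creates a temporary public link.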