Overview

This model, Mistral-7B-Chat-Finetune, is a fine-tuned version of the Mistral 7B model, specifically adapted for sentiment extraction from tweets. It was trained using the Tweet Sentiment Extraction dataset from Hugging Face.

Training Data

Dataset: Tweet Sentiment Extraction dataset from Hugging Face

Dataset Description: This dataset contains tweets labeled with their sentiment (positive, negative, or neutral).
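
The dataset can be pulled directly from the Hub. A minimal sketch, assuming the "mteb/tweet_sentiment_extraction" mirror of the Tweet Sentiment Extraction data (the exact dataset id used for training is not stated in this card):

```python
from datasets import load_dataset

# Assumed dataset id; substitute the exact Hub path used during fine-tuning.
dataset = load_dataset("mteb/tweet_sentiment_extraction", split="train")

print(dataset[0])        # one labelled tweet
print(dataset.features)  # label schema: negative / neutral / positive
```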

Training Details

Model: Mistral 7B.

Fine-tuning Method: Supervised Fine-Tuning (SFT) using the SFTTrainer from the trl library.

Quantization: 4-bit quantization using bitsandbytes.

LoRA Configuration: QLoRA with lora_r=64, lora_alpha=16, and lora_dropout=0.1.

Training Arguments (see the configuration sketch after this list):

Output Directory: ./results.

Number of Training Epochs: 1.

Batch Size per GPU for Training: 4.

Batch Size per GPU for Evaluation: 4.

Gradient Accumulation Steps: 1.

Optimizer: paged_adamw_32bit.

Learning Rate: 2e-4.

Weight Decay: 0.001.

Learning Rate Scheduler: cosine.

Warmup Ratio: 0.03.
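
Taken together, these settings correspond to a standard QLoRA SFT run. Below is a minimal sketch, assuming the mistralai/Mistral-7B-v0.1 base model, an NF4 4-bit quantization type, a "text" column in the dataset, and a trl version whose SFTTrainer still accepts dataset_text_field and max_seq_length directly (newer releases move these into SFTConfig); these specifics are assumptions, not taken from this card:

```python
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig
from trl import SFTTrainer

# Assumed base model and dataset ids; substitute the ones actually used.
base_model = "mistralai/Mistral-7B-v0.1"
dataset = load_dataset("mteb/tweet_sentiment_extraction", split="train")

# 4-bit quantization via bitsandbytes (NF4 compute in fp16 is a common QLoRA choice;
# the exact quantization type used for this model is an assumption).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# LoRA settings as listed above.
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)

# Training arguments as listed above.
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",
    learning_rate=2e-4,
    weight_decay=0.001,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    fp16=True,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
    tokenizer=tokenizer,
    dataset_text_field="text",  # assumes the training text lives in a "text" column
    max_seq_length=512,         # assumed sequence length
)
trainer.train()
```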

Model Performance

This model was fine-tuned to extract sentiment from tweets. At inference time it is used through the text generation pipeline, which produces a response containing the sentiment of the input prompt.

Usage

To use this model, load it from the Hugging Face Hub and run it through the text generation pipeline to extract sentiment from tweets.
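
A minimal inference sketch; the repo id and the [INST] prompt template are assumptions (the card does not state the exact prompt format used during fine-tuning):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Hypothetical repo id; replace with the actual Hub path of this model.
model_id = "your-username/Mistral-7B-Chat-Finetune"

model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Assumed Mistral-style instruction prompt; adjust to the template used in training.
prompt = "<s>[INST] What is the sentiment of this tweet: 'I love sunny mornings!' [/INST]"
output = generator(prompt, max_new_tokens=50, do_sample=False)
print(output[0]["generated_text"])
```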

Model Saving and Deployment

The model was saved using the save_pretrained method. It was pushed to the Hugging Face Hub for sharing and future use.
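
A minimal sketch of that save-and-push step, reusing the trainer from the training sketch above; the local directory and repo names are placeholders:

```python
# Placeholder names for the output directory and Hub repo.
new_model = "Mistral-7B-Chat-Finetune"

# Save the fine-tuned weights and tokenizer locally.
trainer.model.save_pretrained(new_model)
tokenizer.save_pretrained(new_model)

# Push both to the Hugging Face Hub (requires a prior `huggingface-cli login`).
trainer.model.push_to_hub(new_model)
tokenizer.push_to_hub(new_model)
```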
