---
license: llama3.2
datasets:
- stanfordnlp/imdb
language:
- en
metrics:
- accuracy
base_model:
- meta-llama/Llama-3.2-1B
new_version: yash3056/Llama-3.2-1B-imdb
pipeline_tag: text-classification
library_name: transformers
tags:
- transformers
- pytorch
- llama
- llama-3
- 1b
---

## Model Details

### Model Description

- **Funded by:** [Intel](https://console.cloud.intel.com/)
- **Shared by:** [More Information Needed]
- **Model type:** Text Classification
- **Language(s) (NLP):** English
- **License:** Llama 3.2 Community License Agreement
- **Finetuned from model:** [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B)

## Uses

This model is designed for text classification, specifically binary sentiment analysis on datasets like IMDb, where the goal is to classify text as positive or negative. Data scientists, researchers, and developers can use it to build applications for sentiment analysis, content moderation, or customer feedback analysis. It can also be fine-tuned for other binary or multi-class classification tasks in domains such as social media monitoring, product reviews, and support ticket triage. Foreseeable users include AI researchers, developers, and businesses looking to automate text analysis at scale.

### Direct Use

This model can be used directly to identify sentiment in text-based reviews, such as classifying whether a movie or product review is positive or negative. Without further fine-tuning it handles binary sentiment analysis out of the box (96.04% accuracy on the IMDb test set) and can be applied to tasks such as analyzing customer feedback, monitoring social media opinions, or automating sentiment tagging. It suits scenarios where sentiment must be assessed quickly from textual input without deeper customization.

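
Direct use as described above might look like the following minimal sketch. It is an illustration rather than the author's inference code: `classify`, `decode`, and `softmax` are hypothetical helper names, `model_id` is a placeholder for whichever Hub repository hosts the fine-tuned weights, and the 512-token limit and the 0 = negative / 1 = positive encoding are taken from this card's preprocessing description. Reviews are scored one at a time so that no padding token is needed.

```python
import math

# Label encoding stated in this card: 0 = negative, 1 = positive.
ID2LABEL = {0: "negative", 1: "positive"}


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Map one row of logits to a (label, confidence) pair."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


def classify(texts, model_id):
    """Score each text with a sequence-classification checkpoint.

    model_id is a placeholder (assumption): pass the Hub ID of the
    fine-tuned model you want to run.
    """
    # Imported lazily so the pure-Python helpers above work without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    results = []
    for text in texts:
        # One sequence per forward pass, truncated to the 512-token training length.
        inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits[0].tolist()
        results.append(decode(logits))
    return results
```

Calling `classify(["A moving, beautifully shot film."], model_id=...)` returns a list of `(label, confidence)` tuples. Batching several reviews per forward pass would be faster but requires a padding token to be configured on both the tokenizer and the model.
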
### Downstream Use

*Fine-tuning for Binary Classification*

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments
from datasets import load_dataset

# Load IMDb dataset for binary classification
dataset = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Tokenize the dataset (fixed-length padding lets the default collator batch it)
def preprocess(example):
    return tokenizer(example['text'], truncation=True, padding='max_length', max_length=128)

tokenized_datasets = dataset.map(preprocess, batched=True)

# Load model for binary classification (num_labels=2)
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Training arguments
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",  # named eval_strategy in newer transformers releases
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
)

# Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
)

# Fine-tune the model
trainer.train()
```

*Fine-tuning for Multi-Class Classification*

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments
from datasets import load_dataset

# Load AG News dataset for multi-class classification (4 labels)
dataset = load_dataset("ag_news")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Tokenize the dataset (fixed-length padding lets the default collator batch it)
def preprocess(example):
    return tokenizer(example['text'], truncation=True, padding='max_length', max_length=128)

tokenized_datasets = dataset.map(preprocess, batched=True)

# Load model for multi-class classification (num_labels=4)
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=4)

# Training arguments
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",  # named eval_strategy in newer transformers releases
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
)

# Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
)

# Fine-tune the model
trainer.train()
```

## Bias, Risks, and Limitations

While this model is effective for text classification and sentiment analysis, it has limitations and potential biases. The training data, such as the IMDb dataset, may contain inherent biases related to language use, cultural context, or reviewer demographics, which could influence the model’s predictions. For example, the model might struggle with nuanced sentiment, sarcasm, or slang, leading to misclassifications. It could also exhibit biases toward particular opinions or groups if those were overrepresented or underrepresented in the training data. The model is limited to binary sentiment classification, so it may oversimplify more complex emotional states expressed in text. Users should be cautious when applying the model in sensitive domains such as legal, medical, or psychological settings, where misclassification could have serious consequences. Reviewing and adjusting predictions is recommended, especially in high-stakes applications.

### Recommendations

Users (both direct and downstream) should be aware of the risks, biases, and limitations of this model. Because the model may reflect biases present in the training data, users should critically evaluate its performance on the specific datasets or contexts where fairness and accuracy are essential. For applications in sensitive areas such as legal, healthcare, or hiring decisions, the model's predictions should be reviewed with additional care, possibly in combination with human oversight. Fine-tuning the model on domain-specific data or applying bias mitigation techniques can help reduce unintended bias.
Additionally, regular re-evaluation and monitoring of the model in production are encouraged to ensure it continues to meet the desired ethical and performance standards.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Checkpoint to load; replace with the Hub ID of the checkpoint you want to use
checkpoint = "bert-base-uncased"
num_labels = 2  # number of labels (2 for binary sentiment)

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=num_labels)
```

## Training Details

### Training Data

The model was trained on the IMDb dataset, a widely used benchmark for binary sentiment classification. The dataset consists of movie reviews labeled as positive or negative. It contains 50,000 reviews in total, evenly split between positive and negative labels, providing a balanced dataset for training and evaluation. Preprocessing involved tokenizing the text using the AutoTokenizer from Hugging Face's Transformers library, then truncating and padding the sequences to a maximum length of 512 tokens. The training data was further split into training and validation sets with an 80-20 ratio. More information about the IMDb dataset can be found [here](https://huggingface.co/datasets/stanfordnlp/imdb).

### Training Procedure

The training procedure used the Llama-3.2-1B model with modifications to suit the binary sentiment classification task. Training was performed for 10 epochs using a batch size of 8 and the AdamW optimizer with a learning rate of 3e-5. The learning rate was adjusted with a linear schedule, including a warmup of 40% of the total steps. The model was fine-tuned using the IMDb training dataset and evaluated on a separate test set.
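
The schedule described above can be sketched in plain Python. This illustrates the arithmetic only and is not the author's training script. The 20,000-example training set (the 25,000 IMDb training reviews with 80% retained) and the single-device setup without gradient accumulation are assumptions, and the helper names are hypothetical.

```python
import math

# Hyperparameters stated in this card.
EPOCHS = 10
BATCH_SIZE = 8
PEAK_LR = 3e-5
WARMUP_FRACTION = 0.40

# Assumption: 25,000 IMDb training reviews, 80% kept after the 80-20 split.
NUM_TRAIN_EXAMPLES = 20_000


def schedule_lengths(num_examples, batch_size, epochs, warmup_fraction):
    """Return (total optimizer steps, warmup steps) for one device."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * epochs
    warmup_steps = round(total_steps * warmup_fraction)
    return total_steps, warmup_steps


def learning_rate(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


total_steps, warmup_steps = schedule_lengths(
    NUM_TRAIN_EXAMPLES, BATCH_SIZE, EPOCHS, WARMUP_FRACTION
)
print(total_steps, warmup_steps)
```

Under these assumptions the run has 25,000 optimizer steps, the first 10,000 of which are warmup. In a Transformers training loop the same shape is available through `get_linear_schedule_with_warmup`, which takes the optimizer, `num_warmup_steps`, and `num_training_steps`.
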
Validation and evaluation metrics were calculated after each epoch, including accuracy, precision, recall, F1-score, and ROC-AUC. The final model was saved after the last epoch, along with the tokenizer. Several plots, such as loss curves, accuracy curves, a confusion matrix, and a ROC curve, were generated to visually assess the model's performance.

#### Preprocessing

Text data was preprocessed by tokenizing with the Llama-3.2-1B model tokenizer. Sequences were truncated and padded to a maximum length of 512 tokens to ensure consistent input sizes for the model. Labels were encoded as integers (0 for negative and 1 for positive) for compatibility with the model.

## Evaluation

- Training Loss: 0.0030, Accuracy: 0.9999
- Validation Loss: 0.1196, Accuracy: 0.9628

### Testing Data, Factors & Metrics

#### Testing Data

- Test Loss: 0.1315
- Test Accuracy: 0.9604
- Precision: 0.9604
- Recall: 0.9604
- F1-score: 0.9604
- AUC: 0.9604

## Technical Specifications

#### Hardware

[Intel® Data Center GPU Max 1550](https://www.intel.com/content/www/us/en/products/sku/232873/intel-data-center-gpu-max-1550/specifications.html)

## Model Card Authors

- Yash Prakash Narayan ([GitHub](https://github.com/yash3056))