Fine-tuning Llama 2 on a generated dataset to respond sarcastically
The main idea behind the model is to teach an LLM a new behaviour: given an input (a news headline), it responds with an output (sarcastic_headline) in a funny, sarcastic way.
The existing open datasets related to sarcasm are either extracted from social media such as Twitter or Reddit, where most entries are replies to a parent post, or are labelled datasets that merely mark sentences as sarcastic or non-sarcastic. We instead need a dataset that pairs a normal sentence with its sarcastic version, so the model can learn the mapping.
We can generate such a dataset with an LLM: give it a random sentence and ask it to produce a sarcastic version. Once we have the generated dataset, we can fine-tune an LLM to give sarcastic responses.
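As a minimal sketch of that generation step, assuming the Hugging Face `transformers` text-generation pipeline and a Llama 2 13B checkpoint (the model ID and prompt wording below are illustrative, not the repo's exact code):

```python
# Sketch: generate a sarcastic version of a plain sentence with Llama 2 13B.
# The checkpoint ID and prompt wording are assumptions for illustration.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-13b-chat-hf",  # assumed 13B checkpoint
    device_map="auto",
)

def make_sarcastic(sentence: str) -> str:
    prompt = (
        "Rewrite the following sentence as a funny, sarcastic version "
        "while keeping its original meaning.\n"
        f"### sentence: {sentence}\n### sarcastic_version:"
    )
    out = generator(prompt, max_new_tokens=60, do_sample=True, temperature=0.9)
    # The pipeline returns the prompt plus the completion; keep only the completion.
    return out[0]["generated_text"][len(prompt):].strip()

print(make_sarcastic("The city council approved a new parking lot."))
```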
Model Details
We use Llama 2 13B to generate the sarcastic sentences via an appropriate prompt template; for the input sentences we draw on a news-headline category dataset. Once the dataset is generated, we format it and run PEFT on pretrained Llama 2 7B weights. The fine-tuned model can then behave sarcastically and generate satirical responses. To ensure the quality and diversity of the training data, we picked a news-headline category dataset: it covers many different random sentences without grammatical mistakes in the inputs. A rough sketch of this ETL step follows the links below.
- Source Dataset: https://www.kaggle.com/datasets/rmisra/news-category-dataset
- Dataset after ETL: https://github.com/SriRamGovardhanam/Sarcastic-Headline-Llama2/blob/main/formatted_headline_data.csv
- Model type: LLM
- Finetuned from model: Llama2 7B https://huggingface.co/TinyPixel/Llama-2-7B-bf16-sharded/tree/main
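A rough sketch of the ETL step, assuming the Kaggle dataset's JSON-lines layout (`headline` and `category` fields) and reusing the `make_sarcastic` helper sketched above; the file name matches the Kaggle download, but treat all details as illustrative rather than the repo's exact code:

```python
# Sketch: turn news headlines into (headline, sarcastic_headline) training rows.
# Assumes the Kaggle file is JSON lines with "headline"/"category" fields and
# that make_sarcastic() is the generation helper sketched earlier.
import json
import pandas as pd

rows = []
with open("News_Category_Dataset_v3.json") as f:  # file name from the Kaggle dataset
    for line in f:
        record = json.loads(line)
        rows.append({"category": record["category"], "headline": record["headline"]})

# ~2,100 examples were used for fine-tuning; sample size here mirrors that.
sample = pd.DataFrame(rows).sample(n=2100, random_state=42)
sample["sarcastic_headline"] = sample["headline"].map(make_sarcastic)
sample.to_csv("formatted_headline_data.csv", index=False)
```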
Model Fine-tuning Code
The Hugging Face team developed a Python library, autotrain-advanced, with which we can fine-tune any LLM in just one line of code. The Python code to generate the data and to fine-tune the model is in the repo below:
- Repository: https://github.com/SriRamGovardhanam/Sarcastic-Headline-Llama2
- For a line-by-line code breakdown, refer to: [Coming soon]
Uses
- Enhanced Natural Language Understanding: In applications like chatbots or virtual assistants, a model trained to understand sarcasm can provide more contextually relevant responses, improving user interactions.
- Niche applications: For satirical sites like The Onion, the model may help support or improve writers' output; social media platforms can use it to engage users with witty, sarcastic responses.
Direct Use
Refer to the inference code available in the repo: https://github.com/SriRamGovardhanam/Sarcastic-Headline-Llama2
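For orientation, inference boils down to roughly the following, assuming the base checkpoint listed above and a local PEFT adapter path (the path 'sarcastic-headline-gen' is illustrative):

```python
# Sketch: load the base model plus the PEFT adapter and generate a response.
# The adapter path is an assumption; point it at the weights from your own run.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "TinyPixel/Llama-2-7B-bf16-sharded"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "sarcastic-headline-gen")  # assumed adapter path

# Prompt template taken from the Results section below.
prompt = (
    "You are a savage, disrespectful and witty agent. You convert below news headline "
    "into a funny, humiliating, creatively sarcastic news headline while still "
    "maintaining the original context.\n"
    "### headline: mansoons are best for mosquitoes\n### sarcastic_headline:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=48, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```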
Downstream Use
- Content Generation: In creative writing and content creation, the model can be used to inject humor and sarcasm into articles, scripts, advertisements, or marketing materials to make them more engaging.
- Brand Persona: Some companies adopt a brand persona characterized by humor and sarcasm in their communications. The model can assist in maintaining this tone in marketing campaigns and customer interactions.
- Social Media Engagement: Brands and influencers on social media may use the model to craft sarcastic posts or responses that resonate with their audience, leading to increased engagement and brand awareness.
Recommendations
- There is a lot of room for improvement here. At the ETL stage, when generating the dataset, we could provide a different prompt for each available category to elicit even funnier responses (see the sketch after this list).
- The dataset used for fine-tuning has only 2,100 examples; it can be enlarged. Because of GPU memory constraints, I trained for only 8 epochs, which can also be increased.
- I opted for news headlines because of the quality and diversity of the training data. If the sole purpose of the model is to generate more enticing sarcastic news headlines, a better approach would be to generate a news description first and then generate a headline for that description.
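As an illustration of the first recommendation, a per-category prompt table could look like this (category names follow the Kaggle dataset; the prompt wording is hypothetical):

```python
# Sketch: pick a generation prompt per news category instead of one global prompt.
# Category names mirror the Kaggle dataset; prompt wording is made up for illustration.
CATEGORY_PROMPTS = {
    "POLITICS": "Rewrite this political headline with biting, deadpan sarcasm:",
    "SPORTS": "Rewrite this sports headline as an over-the-top mock celebration:",
    "TECH": "Rewrite this tech headline as if the 'innovation' is hilariously pointless:",
}
DEFAULT_PROMPT = "Rewrite this headline as a funny, sarcastic version:"

def build_prompt(category: str, headline: str) -> str:
    instruction = CATEGORY_PROMPTS.get(category, DEFAULT_PROMPT)
    return f"{instruction}\n### headline: {headline}\n### sarcastic_headline:"
```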
How to Get Started with the Model
- To fine-tune on your own dataset, use the Colab notebook files in this repo: https://github.com/SriRamGovardhanam/Sarcastic-Headline-Llama2
- For a quick inference run against this model card, refer to the Inference notebook in the same repo.
Training Details
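# autotrain-advanced fine-tuning run: PEFT with int4 quantization via the SFT trainer,
# 8 epochs over the generated headline dataset; the job runs in the background and
# its output is redirected to training.log.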
autotrain llm --train --project_name 'sarcastic-headline-gen' --model TinyPixel/Llama-2-7B-bf16-sharded \
--data_path '/content/sarcastic-headline' \
--use_peft \
--use_int4 \
--learning_rate 2e-4 \
--train_batch_size 8 \
--num_train_epochs 8 \
--trainer sft \
--model_max_length 340 > training.log &
Training Data
The model was fine-tuned on the generated dataset linked above (formatted_headline_data.csv): roughly 2,100 headline/sarcastic-headline pairs produced by Llama 2 13B from the news-category headlines.
Results
Input headline: mansoons are best for mosquitoes
Formatted input template for the fine-tuned LLM:
You are a savage, disrespectful and witty agent. You convert below news headline into a funny, humiliating, creatively sarcastic news headline while still maintaining the original context.
### headline: mansoons are best for mosquitoes
### sarcastic_headline:
Output after inference:
You are a savage, disrespectful and witty agent. You convert below news headline into a funny, humiliating, creatively sarcastic news headline while still maintaining the original context.
### headline: mansoons are best for mosquitoes
### sarcastic_headline: Another Study Proves That Men's Sweaty Bums Are The Best Repellent Against Mosquitoes
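Because the model echoes the full template, a small helper can cut the completion out of the generated text (a sketch; the marker string comes from the template above):

```python
# Sketch: keep only the completion after the "### sarcastic_headline:" marker.
def extract_headline(generated_text: str) -> str:
    marker = "### sarcastic_headline:"
    _, _, completion = generated_text.partition(marker)
    return completion.strip()

print(extract_headline(
    "### headline: mansoons are best for mosquitoes\n"
    "### sarcastic_headline: Another Study Proves That Men's Sweaty Bums Are "
    "The Best Repellent Against Mosquitoes"
))
```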
Summary
- The primary purpose of this model is to generate humor and entertainment. It can be used in chatbots, virtual assistants, or social media platforms to engage users with witty and sarcastic responses.
- One advantage of using a Llama 2 model instead of ChatGPT for dataset generation: OpenAI does not allow offensive words or hate speech, so even if we include them in the prompt template, ChatGPT will not produce brutal or humiliating responses, which is reasonable and ethical for such a big organization.
- This advantage is a double-edged sword, as some people cannot handle these types of responses and may consider them harassment or offensive.
Model Objective
This model is not intended to target any specific race, gender, or region. Its sole purpose is to understand LLMs and tap their ability to entertain and engage.
Compute Infrastructure
Google Colab Pro is needed if you plan to train for more than 5 epochs on ~2,100 samples with model_max_length < 650.
Citation
The source dataset of news headlines is taken from https://www.kaggle.com/datasets/rmisra/news-category-dataset
Misra, Rishabh. "News Category Dataset." arXiv preprint arXiv:2209.11429 (2022).
Model Card Authors
Sriram Govardhanam
http://www.linkedin.com/in/SriRamGovardhanam
Model Card Contact
http://www.linkedin.com/in/SriRamGovardhanam