Fine-tuning Llama 2 on a generated dataset to respond sarcastically
The main idea behind the model is to teach an LLM a new behaviour: given an input (a news headline), it responds with an output (sarcastic_headline) in a funny, sarcastic way.
The existing open datasets related to sarcasm are either extracted from social media such as Twitter or Reddit, where most entries are replies to a parent post, or are labelled datasets that merely mark sentences as sarcastic or non-sarcastic. We instead need a dataset that pairs a normal sentence with its sarcastic version, so the model can learn the mapping.
We can generate such a dataset with an LLM: give it a random sentence and ask it to produce a sarcastic version. Once we have the generated dataset, we can fine-tune an LLM to give sarcastic responses.
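As a minimal sketch of that generation step, assuming the Hugging Face `transformers` text-generation pipeline and a Llama 2 13B checkpoint (the model ID and prompt wording below are illustrative, not the repo's exact code):

```python
# Sketch: generate a sarcastic version of a plain sentence with Llama 2 13B.
# The checkpoint ID and prompt wording are assumptions for illustration.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-13b-chat-hf",  # assumed 13B checkpoint
    device_map="auto",
)

def make_sarcastic(sentence: str) -> str:
    prompt = (
        "Rewrite the following sentence as a funny, sarcastic version "
        "while keeping its original meaning.\n"
        f"### sentence: {sentence}\n### sarcastic_version:"
    )
    out = generator(prompt, max_new_tokens=60, do_sample=True, temperature=0.9)
    # The pipeline returns the prompt plus the completion; keep only the completion.
    return out[0]["generated_text"][len(prompt):].strip()

print(make_sarcastic("The city council approved a new parking lot."))
```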
Model Details
We use Llama 2 13B to generate the sarcastic sentences via an appropriate prompt template; for the input sentences we draw on a news-headline category dataset. Once the dataset is generated, we format it and run PEFT on pretrained Llama 2 7B weights. The fine-tuned model can then behave sarcastically and generate satirical responses. To ensure the quality and diversity of the training data, we picked a news-headline category dataset: it covers many different random sentences without grammatical mistakes in the inputs. A rough sketch of this ETL step follows the links below.
- Source Dataset: https://www.kaggle.com/datasets/rmisra/news-category-dataset
- Dataset after ETL: https://github.com/SriRamGovardhanam/Sarcastic-Headline-Llama2/blob/main/formatted_headline_data.csv
- Model type: LLM
- Finetuned from model: Llama2 7B https://huggingface.co/TinyPixel/Llama-2-7B-bf16-sharded/tree/main
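A rough sketch of the ETL step, assuming the Kaggle dataset's JSON-lines layout (`headline` and `category` fields) and reusing the `make_sarcastic` helper sketched above; the file name matches the Kaggle download, but treat all details as illustrative rather than the repo's exact code:

```python
# Sketch: turn news headlines into (headline, sarcastic_headline) training rows.
# Assumes the Kaggle file is JSON lines with "headline"/"category" fields and
# that make_sarcastic() is the generation helper sketched earlier.
import json
import pandas as pd

rows = []
with open("News_Category_Dataset_v3.json") as f:  # file name from the Kaggle dataset
    for line in f:
        record = json.loads(line)
        rows.append({"category": record["category"], "headline": record["headline"]})

# ~2,100 examples were used for fine-tuning; sample size here mirrors that.
sample = pd.DataFrame(rows).sample(n=2100, random_state=42)
sample["sarcastic_headline"] = sample["headline"].map(make_sarcastic)
sample.to_csv("formatted_headline_data.csv", index=False)
```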
Model Fine-tuning Code
The Hugging Face team developed a Python library, autotrain-advanced, with which we can fine-tune any LLM in just one line of code. The Python code to generate the data and to fine-tune the model is in the repo below:
- Repository: https://github.com/SriRamGovardhanam/Sarcastic-Headline-Llama2
- For a line-by-line code breakdown, refer to: [Coming soon]
Uses
- Enhanced Natural Language Understanding: In applications like chatbots or virtual assistants, a model trained to understand sarcasm can provide more contextually relevant responses, improving user interactions.
- Niche applications: For satirical sites like The Onion, the model may help support or improve writers' output; social media platforms can use it to engage users with witty, sarcastic responses.
Direct Use
Refer to the inference code available in the repo: https://github.com/SriRamGovardhanam/Sarcastic-Headline-Llama2
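For orientation, inference boils down to roughly the following, assuming the base checkpoint listed above and a local PEFT adapter path (the path 'sarcastic-headline-gen' is illustrative):

```python
# Sketch: load the base model plus the PEFT adapter and generate a response.
# The adapter path is an assumption; point it at the weights from your own run.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "TinyPixel/Llama-2-7B-bf16-sharded"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "sarcastic-headline-gen")  # assumed adapter path

# Prompt template taken from the Results section below.
prompt = (
    "You are a savage, disrespectful and witty agent. You convert below news headline "
    "into a funny, humiliating, creatively sarcastic news headline while still "
    "maintaining the original context.\n"
    "### headline: mansoons are best for mosquitoes\n### sarcastic_headline:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=48, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```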
Downstream Use
- Content Generation: In creative writing and content creation, the model can be used to inject humor and sarcasm into articles, scripts, advertisements, or marketing materials to make them more engaging.
- Brand Persona: Some companies adopt a brand persona characterized by humor and sarcasm in their communications. The model can assist in maintaining this tone in marketing campaigns and customer interactions.
- Social Media Engagement: Brands and influencers on social media may use the model to craft sarcastic posts or responses that resonate with their audience, leading to increased engagement and brand awareness.
Recommendations
- There is a lot of room for improvement here. At the ETL stage, when generating the dataset, we could provide a different prompt for each available category to elicit even funnier responses (see the sketch after this list).
- The dataset used for fine-tuning has only 2,100 examples; it can be enlarged. Because of GPU memory constraints, I trained for only 8 epochs, which can also be increased.
- I opted for news headlines because of the quality and diversity of the training data. If the sole purpose of the model is to generate more enticing sarcastic news headlines, a better approach would be to generate a news description first and then generate a headline for that description.
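As an illustration of the first recommendation, a per-category prompt table could look like this (category names follow the Kaggle dataset; the prompt wording is hypothetical):

```python
# Sketch: pick a generation prompt per news category instead of one global prompt.
# Category names mirror the Kaggle dataset; prompt wording is made up for illustration.
CATEGORY_PROMPTS = {
    "POLITICS": "Rewrite this political headline with biting, deadpan sarcasm:",
    "SPORTS": "Rewrite this sports headline as an over-the-top mock celebration:",
    "TECH": "Rewrite this tech headline as if the 'innovation' is hilariously pointless:",
}
DEFAULT_PROMPT = "Rewrite this headline as a funny, sarcastic version:"

def build_prompt(category: str, headline: str) -> str:
    instruction = CATEGORY_PROMPTS.get(category, DEFAULT_PROMPT)
    return f"{instruction}\n### headline: {headline}\n### sarcastic_headline:"
```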
How to Get Started with the Model
- To fine-tune on your own dataset, use the Colab notebook files in this repo: https://github.com/SriRamGovardhanam/Sarcastic-Headline-Llama2
- For a quick inference run against this model card, refer to the Inference notebook in the same repo.
Training Details
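# autotrain-advanced fine-tuning run: PEFT with int4 quantization via the SFT trainer,
# 8 epochs over the generated headline dataset; the job runs in the background and
# its output is redirected to training.log.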
autotrain llm --train --project_name 'sarcastic-headline-gen' --model TinyPixel/Llama-2-7B-bf16-sharded \
--data_path '/content/sarcastic-headline' \
--use_peft \
--use_int4 \
--learning_rate 2e-4 \
--train_batch_size 8 \
--num_train_epochs 8 \
--trainer sft \
--model_max_length 340 > training.log &
Training Data
The model was fine-tuned on the generated dataset linked above (formatted_headline_data.csv): roughly 2,100 headline/sarcastic-headline pairs produced by Llama 2 13B from the news-category headlines.
Results
Input headline: mansoons are best for mosquitoes
Formatted input template for the fine-tuned LLM:
You are a savage, disrespectful and witty agent. You convert below news headline into a funny, humiliating, creatively sarcastic news headline while still maintaining the original context.
### headline: mansoons are best for mosquitoes
### sarcastic_headline:
Output after inference:
You are a savage, disrespectful and witty agent. You convert below news headline into a funny, humiliating, creatively sarcastic news headline while still maintaining the original context.
### headline: mansoons are best for mosquitoes
### sarcastic_headline: Another Study Proves That Men's Sweaty Bums Are The Best Repellent Against Mosquitoes
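Because the model echoes the full template, a small helper can cut the completion out of the generated text (a sketch; the marker string comes from the template above):

```python
# Sketch: keep only the completion after the "### sarcastic_headline:" marker.
def extract_headline(generated_text: str) -> str:
    marker = "### sarcastic_headline:"
    _, _, completion = generated_text.partition(marker)
    return completion.strip()

print(extract_headline(
    "### headline: mansoons are best for mosquitoes\n"
    "### sarcastic_headline: Another Study Proves That Men's Sweaty Bums Are "
    "The Best Repellent Against Mosquitoes"
))
```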
Summary
- The primary purpose of this model is to generate humor and entertainment. It can be used in chatbots, virtual assistants, or social media platforms to engage users with witty and sarcastic responses.
- One advantage of using a Llama 2 model instead of ChatGPT for dataset generation: OpenAI does not allow offensive words or hate speech, so even if we include them in the prompt template, ChatGPT will not produce brutal or humiliating responses, which is reasonable and ethical for such a big organization.
- This advantage is a double-edged sword, as some people cannot handle these types of responses and may consider them harassment or offensive.
Model Objective
This model is not intended to target any specific race, gender, or region. Its sole purpose is to understand LLMs and tap their ability to entertain and engage.
Compute Infrastructure
Google Colab Pro is needed if you plan to train for more than 5 epochs on ~2,100 samples with model_max_length < 650.
Citation
The source dataset of news headlines is taken from https://www.kaggle.com/datasets/rmisra/news-category-dataset
Misra, Rishabh. "News Category Dataset." arXiv preprint arXiv:2209.11429 (2022).
Model Card Authors
Sriram Govardhanam
http://www.linkedin.com/in/SriRamGovardhanam
Model Card Contact
http://www.linkedin.com/in/SriRamGovardhanam