--- base_model: - unsloth/llama-2-7b-bnb-4bit - hermeschen1116/response_generator_for_emotion_chat_bot library_name: peft license: apache-2.0 datasets: - Shotaro30678/rlhf-RG-trl-style-v3 tags: - trl - unsloth language: - en pipeline_tag: text-generation --- # Response Generator for [Emotion Chat Bot](https://github.com/hermeschen1116/chat-bot) ## Model description This model is a dpo fine-tuned version of [hermeschen1116/response_generator_for_emotion_chat_bot](https://huggingface.co/hermeschen1116/response_generator_for_emotion_chat_bot) on [Shotaro30678/rlhf-RG-trl-style-v3](https://huggingface.co/datasets/Shotaro30678/rlhf-RG-trl-style-v3), self modified version of [daily_dialog](li2017dailydialog/daily_dialog). ## Intended uses & limitations Use dpo trainer to do the RLHF so that the model can be more precise and consistent. ## Model performance **Sentiment Score:** **[Shotaro30678/emotion_text_classifier_on_dd_v1](https://huggingface.co/Shotaro30678/emotion_text_classifier_on_dd_v1)** | **Metric** | **DPO Trained Model** | **SFT Model (Reference)** | |--------------|:----------------------:|:--------------------------:| | **Accuracy** | 0.851 | 0.788 | | **F1-score** | 0.8564 | 0.7975 | **Gibberish Distribution:** **[madhurjindal/autonlp-Gibberish-Detector-492513457](https://huggingface.co/madhurjindal/autonlp-Gibberish-Detector-492513457)** | **Category** | **DPO Trained Model** | **SFT Model (Reference)** | |---------------------|:----------------------:|:--------------------------:| | **Clean** | 882 | 898 | | **Mild Gibberish** | 94 | 58 | | **Word Salad** | 21 | 33 | | **Noise** | 3 | 11 | **Cut-Off Output:** | **Output Type** | **DPO Trained Model** | **SFT Model (Reference)** | |---------------------|:----------------------:|:--------------------------:| | **Complete Output** | 985 | 975 | | **Incomplete Output** | 15 | 25 | on [hermeschen1116/daily_dialog_for_RG](https://huggingface.co/datasets/hermeschen1116/daily_dialog_for_RG) test split. **test on config:** ```python generation_config = GenerationConfig( max_new_tokens=150, min_new_tokens=5, repetition_penalty=1.1, top_k=3, top_p=0.9, pad_token_id=tokenizer.pad_token_id, eos_token_id=tokenizer.eos_token_id, temperature=1.0, do_sample=True, num_beams=1 ) ``` ## Training procedure ### Training hyperparameters The following hyperparameters were used during training: - beta=0.1, - remove_unused_columns=False, - num_train_epochs=3, - gradient_checkpointing=True others remain default ### Framework versions - Bitsandbytes 0.43.1 - Datasets 2.20.0 - PEFT 0.11.1 - Pytorch 2.3.0+cu121 - Transformers 4.42.4 - Tokenizers 0.19.1 - Trl 0.8.6 - unsloth 2024.7 0f2e484 # Uploaded model - **Developed by:** Shotaro30678 - **Finetuned from model :** hermeschen1116/response_generator_for_emotion_chat_bot This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library. [](https://github.com/unslothai/unsloth) # Quick sample ```python # libs are from github repo from libs import ResponseGeneratorPipeline from unsloth import FastLanguageModel model, tokenizer = FastLanguageModel.from_pretrained( model_name = "Shotaro30678/response_generator_DPO", # YOUR MODEL YOU USED FOR TRAINING load_in_4bit = True, ) FastLanguageModel.for_inference(model) # Enable native 2x faster inference bot = ResponseGeneratorPipeline( model, tokenizer, framework="pt", task="conversation-generation", num_workers=16, torch_dtype="auto", add_special_tokens=True, truncation=False, padding=True ) conversation = [ {'content': {'dialog': '', 'emotion': ''}, 'role': 'system'}, {'content': {'dialog': 'Can you do push-ups ?', 'emotion': 'neutral'}, 'role': 'user'}, {'content': {'dialog': "Of course I can . It's a piece of cake ! Believe it or not , I can do 30 push-ups a minute .", 'emotion': 'neutral'}, 'role': 'assistant'}, {'content': {'dialog': "Really ? I think that's impossible !", 'emotion': 'surprise'}, 'role': 'user'}, {'content': {'dialog': 'You mean 30 push-ups ?', 'emotion': 'neutral'}, 'role': 'assistant'}, {'content': {'dialog': 'Yeah !', 'emotion': 'neutral'}, 'role': 'user'}, {'content': {'dialog': '', 'emotion': 'neutral'}, 'role': 'assistant'} ] generation_config = GenerationConfig( max_new_tokens=150, min_new_tokens=5, repetition_penalty=1.1, top_k=3, top_p=0.9, pad_token_id=tokenizer.pad_token_id, eos_token_id=tokenizer.eos_token_id, temperature=1.0, do_sample=True, num_beams=1 ) print(bot(conversation, generation_config=generation_config)[0]['generated_text'][-1]["content"]["dialog"]) ``` **output:** ``` 30 push-ups in a row? ```