---
license: mit
datasets:
- mlabonne/guanaco-llama2-1k
language:
- en
base_model:
- NousResearch/Llama-2-7b-chat-hf
pipeline_tag: text-generation
library_name: transformers
finetuned_model: true
model_type: causal-lm
finetuned_task: instruction-following
tags:
- instruction-following
- text-generation
- fine-tuned
- llama2
- causal-language-model
- QLoRA
- 4-bit-quantization
- low-memory
- training-optimized
metrics:
- accuracy
- loss
---

# Llama-2-7B-Chat Fine-Tuned Model

This model is a fine-tuned version of the **Llama-2-7B-Chat** model, optimized for instruction-following tasks. It was trained on the `mlabonne/guanaco-llama2-1k` dataset and can be used for a range of text-generation tasks, including question answering, summarization, and text completion.

## Model Details

- **Base Model**: NousResearch/Llama-2-7b-chat-hf
- **Fine-Tuning Task**: Instruction-following
- **Training Dataset**: mlabonne/guanaco-llama2-1k
- **Optimized For**: Text generation, question answering, summarization, and more.
- **Fine-Tuned Parameters**:
  - **LoRA** (Low-Rank Adaptation) applied so that only small, low-rank adapter matrices are updated during training.
  - The base model is loaded in **4-bit** precision (QLoRA) for memory efficiency and better GPU utilization.
  - Training uses **gradient accumulation** and **gradient checkpointing** to reduce memory pressure, and **weight decay** to limit overfitting (a configuration sketch is provided at the end of this card).

## Usage

You can use this fine-tuned model with the Hugging Face `transformers` library. Below is an example of how to load and use the model for text generation.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned model and tokenizer by repo ID
tokenizer = AutoTokenizer.from_pretrained("devshaheen/llama-2-7b-chat-finetune")
model = AutoModelForCausalLM.from_pretrained("devshaheen/llama-2-7b-chat-finetune")

# Example text generation
input_text = "What is the capital of France?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```
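
If you have a compatible GPU, you can also load the model with 4-bit quantization to reduce memory usage at inference time. The sketch below assumes the `bitsandbytes` package is installed; the quantization settings shown are common QLoRA-style defaults, not values published with this checkpoint.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantization config (requires bitsandbytes); these are typical
# QLoRA-style defaults, not settings confirmed for this model.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained("devshaheen/llama-2-7b-chat-finetune")
model = AutoModelForCausalLM.from_pretrained(
    "devshaheen/llama-2-7b-chat-finetune",
    quantization_config=bnb_config,
    device_map="auto",  # place layers on the available GPU(s) automatically
)
```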
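
## Fine-Tuning Configuration (Sketch)

The exact training hyperparameters are not published in this card. The snippet below is a minimal sketch of how the techniques listed under **Fine-Tuned Parameters** (LoRA adapters, a 4-bit base model, gradient accumulation, gradient checkpointing, and weight decay) fit together using `peft` and `transformers`. All numeric values are illustrative placeholders, not the settings actually used for this checkpoint.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)

base_model = "NousResearch/Llama-2-7b-chat-hf"

# Load the frozen base weights in 4-bit precision (QLoRA-style setup).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters: only these low-rank matrices are updated during training.
# r, alpha, and dropout are illustrative defaults.
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, peft_config)

# Instruction-following dataset used for fine-tuning.
dataset = load_dataset("mlabonne/guanaco-llama2-1k", split="train")

# Training arguments reflecting the techniques listed above; the numbers
# are placeholders, not the actual hyperparameters.
training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    weight_decay=0.001,
    learning_rate=2e-4,
    num_train_epochs=1,
    logging_steps=25,
)
```

These configuration objects can then be passed, together with the dataset and tokenizer, to a supervised fine-tuning trainer such as `trl`'s `SFTTrainer` or the standard `transformers` `Trainer`.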