|
--- |
|
license: mit |
|
datasets: |
|
- mlabonne/guanaco-llama2-1k |
|
language: |
|
- en |
|
base_model: |
|
- NousResearch/Llama-2-7b-chat-hf |
|
pipeline_tag: text-generation |
|
library_name: transformers |
|
finetuned_model: true |
|
model_type: causal-lm |
|
finetuned_task: instruction-following |
|
tags: |
|
- instruction-following |
|
- text-generation |
|
- fine-tuned |
|
- llama2 |
|
- causal-language-model |
|
- QLoRa |
|
- 4-bit-quantization |
|
- low-memory |
|
- training-optimized |
|
metrics: |
|
- accuracy |
|
- loss |
|
--- |
|
|
|
# Llama-2-7B-Chat Fine-Tuned Model |
|
|
|
This model is a fine-tuned version of **Llama-2-7B-Chat** model, optimized for instruction-following tasks. It has been trained on the `mlabonne/guanaco-llama2-1k` dataset and is optimized for efficient text generation across various NLP tasks, including question answering, summarization, and text completion. |
|
|
|
## Model Details |
|
- **Base Model**: NousResearch/Llama-2-7b-chat-hf |
|
- **Fine-Tuning Task**: Instruction-following |
|
- **Training Dataset**: mlabonne/guanaco-llama2-1k |
|
- **Optimized For**: Text generation, question answering, summarization, and more. |
|
- **Fine-Tuned Parameters**: |
|
- **LoRA** (Low-Rank Adaption) applied for efficient training with smaller parameter updates. |
|
- Quantized to **4-bit** for memory efficiency and better GPU utilization. |
|
- Training includes **gradient accumulation**, **gradient checkpointing**, and **weight decay** to prevent overfitting and enhance memory efficiency. |
|
|
|
## Usage |
|
|
|
You can use this fine-tuned model with the Hugging Face `transformers` library. Below is an example of how to load and use the model for text generation. |
|
|
|
```python |
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
|
# Load pre-trained model and tokenizer |
|
tokenizer = AutoTokenizer.from_pretrained("YOUR_HUGGINGFACE_USERNAME/llama-2-7b-chat-finetune") |
|
model = AutoModelForCausalLM.from_pretrained("YOUR_HUGGINGFACE_USERNAME/llama-2-7b-chat-finetune") |
|
|
|
# Example text generation |
|
input_text = "What is the capital of France?" |
|
inputs = tokenizer(input_text, return_tensors="pt") |
|
outputs = model.generate(**inputs) |
|
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True) |
|
|
|
print(generated_text) |
|
|
|
|
|
|
|
|
|
|
|
|
|
@misc{llama-2-7b-chat-finetune, |
|
author = {Shaheen Nabi}, |
|
title = {Fine-tuned Llama-2-7B-Chat Model}, |
|
year = {2024}, |
|
publisher = {Hugging Face}, |
|
howpublished = {\url{https://huggingface.co/devshaheen/llama-2-7b-chat-finetune}}, |
|
} |
|
|