devshaheen committed · Commit dc5a60a · 1 Parent(s): 4a9d5fd
update readme

README.md CHANGED
@@ -11,24 +11,6 @@ library_name: transformers
 finetuned_model: true
 model_type: causal-lm
 finetuned_task: instruction-following
-training_args:
-  num_train_epochs: 1
-  batch_size: 4
-  learning_rate: 2e-4
-  weight_decay: 0.001
-  gradient_accumulation_steps: 1
-  gradient_checkpointing: true
-  max_grad_norm: 0.3
-  logging_steps: 25
-  warmup_ratio: 0.03
-  optim: paged_adamw_32bit
-  lr_scheduler_type: cosine
-metrics:
-  - accuracy
-  - loss
-description: |
-  This is a fine-tuned version of the Llama-2-7B-Chat model, trained on the `mlabonne/guanaco-llama2-1k` dataset for instruction-following tasks. The model has been adapted for text generation, including various NLP tasks such as question answering, summarization, and more.
-  The fine-tuning process utilizes QLoRa and 4-bit quantization for memory efficiency and better GPU utilization. This model has been optimized for instruction-following and efficient training with gradient accumulation and checkpointing.
 tags:
 - instruction-following
 - text-generation
@@ -39,11 +21,53 @@ tags:
 - 4-bit-quantization
 - low-memory
 - training-optimized
-
-
-
-model_compatibility:
-  - GPU: Yes (with support for 4-bit and bf16, check compatibility with your hardware)
-  - Suitable for tasks: Text generation, Question answering, Summarization, and Instruction-based tasks
-
+metrics:
+- accuracy
+- loss
 ---
+
+# Llama-2-7B-Chat Fine-Tuned Model
+
+This model is a fine-tuned version of the **Llama-2-7B-Chat** model, adapted for instruction-following tasks. It was trained on the `mlabonne/guanaco-llama2-1k` dataset and is optimized for efficient text generation across a range of NLP tasks, including question answering, summarization, and text completion.
+
+## Model Details
+- **Base Model**: NousResearch/Llama-2-7b-chat-hf
+- **Fine-Tuning Task**: Instruction-following
+- **Training Dataset**: mlabonne/guanaco-llama2-1k
+- **Optimized For**: Text generation, question answering, summarization, and more
+- **Fine-Tuned Parameters**:
+  - **LoRA** (Low-Rank Adaptation) applied for efficient training with smaller parameter updates.
+  - Quantized to **4-bit** for memory efficiency and better GPU utilization (see the loading sketch below).
+  - Training uses **gradient accumulation**, **gradient checkpointing**, and **weight decay** to limit memory use and prevent overfitting (see the configuration sketch below).
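+
+As a minimal sketch of the 4-bit loading mentioned above (assuming `bitsandbytes` is installed; the exact quantization settings are illustrative, not the recorded training configuration):
+
+```python
+import torch
+from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+
+# Illustrative 4-bit (NF4) quantization config in the QLoRA style.
+bnb_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype=torch.float16,
+)
+
+# Load the model with its weights quantized to 4-bit at load time.
+model = AutoModelForCausalLM.from_pretrained(
+    "devshaheen/llama-2-7b-chat-finetune",
+    quantization_config=bnb_config,
+    device_map="auto",  # place layers on available devices automatically
+)
+```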
+
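+For reference, the hyperparameters recorded in the earlier revision of this card map onto `transformers.TrainingArguments` roughly as follows (a sketch of an equivalent configuration, not the author's actual training script; `output_dir` is an assumed placeholder):
+
+```python
+from transformers import TrainingArguments
+
+# Values taken from the removed `training_args` frontmatter above.
+training_args = TrainingArguments(
+    output_dir="./results",  # assumption: any local path works
+    num_train_epochs=1,
+    per_device_train_batch_size=4,
+    learning_rate=2e-4,
+    weight_decay=0.001,
+    gradient_accumulation_steps=1,
+    gradient_checkpointing=True,
+    max_grad_norm=0.3,
+    logging_steps=25,
+    warmup_ratio=0.03,
+    optim="paged_adamw_32bit",
+    lr_scheduler_type="cosine",
+)
+```
+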
+## Usage
+
+You can use this fine-tuned model with the Hugging Face `transformers` library. Below is an example of how to load and use the model for text generation.
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+# Load the fine-tuned model and tokenizer from the Hub
+tokenizer = AutoTokenizer.from_pretrained("devshaheen/llama-2-7b-chat-finetune")
+model = AutoModelForCausalLM.from_pretrained("devshaheen/llama-2-7b-chat-finetune")
+
+# Example text generation
+input_text = "What is the capital of France?"
+inputs = tokenizer(input_text, return_tensors="pt")
+outputs = model.generate(**inputs, max_new_tokens=100)  # cap the response length
+generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+print(generated_text)
+```
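+
+Since the base model follows the Llama-2 chat format, wrapping the instruction in `[INST]` tags may yield better responses (a usage note based on the standard Llama-2 prompt template, not something this card specifies):
+
+```python
+# Reuses the tokenizer and model loaded above.
+prompt = "<s>[INST] What is the capital of France? [/INST]"
+inputs = tokenizer(prompt, return_tensors="pt")
+outputs = model.generate(**inputs, max_new_tokens=100)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```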
+
+## Citation
+
+```bibtex
+@misc{llama-2-7b-chat-finetune,
+  author = {Shaheen Nabi},
+  title = {Fine-tuned Llama-2-7B-Chat Model},
+  year = {2024},
+  publisher = {Hugging Face},
+  howpublished = {\url{https://huggingface.co/devshaheen/llama-2-7b-chat-finetune}},
+}
+```
|