devshaheen committed · Commit dc5a60a · 1 Parent(s): 4a9d5fd
update readme

README.md CHANGED
@@ -11,24 +11,6 @@ library_name: transformers
 finetuned_model: true
 model_type: causal-lm
 finetuned_task: instruction-following
-training_args:
-  num_train_epochs: 1
-  batch_size: 4
-  learning_rate: 2e-4
-  weight_decay: 0.001
-  gradient_accumulation_steps: 1
-  gradient_checkpointing: true
-  max_grad_norm: 0.3
-  logging_steps: 25
-  warmup_ratio: 0.03
-  optim: paged_adamw_32bit
-  lr_scheduler_type: cosine
-metrics:
-  - accuracy
-  - loss
-description: |
-  This is a fine-tuned version of the Llama-2-7B-Chat model, trained on the `mlabonne/guanaco-llama2-1k` dataset for instruction-following tasks. The model has been adapted for text generation, including various NLP tasks such as question answering, summarization, and more.
-  The fine-tuning process utilizes QLoRa and 4-bit quantization for memory efficiency and better GPU utilization. This model has been optimized for instruction-following and efficient training with gradient accumulation and checkpointing.
 tags:
 - instruction-following
 - text-generation
@@ -39,11 +21,53 @@ tags:
 - 4-bit-quantization
 - low-memory
 - training-optimized
-
-
-
-model_compatibility:
-  - GPU: Yes (with support for 4-bit and bf16, check compatibility with your hardware)
-  - Suitable for tasks: Text generation, Question answering, Summarization, and Instruction-based tasks
-
+metrics:
+- accuracy
+- loss
 ---
+
+# Llama-2-7B-Chat Fine-Tuned Model
+
+This model is a fine-tuned version of the **Llama-2-7B-Chat** model, adapted for instruction-following tasks. It was trained on the `mlabonne/guanaco-llama2-1k` dataset and is optimized for efficient text generation across a range of NLP tasks, including question answering, summarization, and text completion.
+
+## Model Details
+- **Base Model**: NousResearch/Llama-2-7b-chat-hf
+- **Fine-Tuning Task**: Instruction-following
+- **Training Dataset**: mlabonne/guanaco-llama2-1k
+- **Optimized For**: Text generation, question answering, summarization, and more
+- **Fine-Tuned Parameters**:
+  - **LoRA** (Low-Rank Adaptation) applied for efficient training with smaller parameter updates.
+  - Quantized to **4-bit** for memory efficiency and better GPU utilization (see the loading sketch below).
+  - Training uses **gradient accumulation**, **gradient checkpointing**, and **weight decay** to limit memory use and prevent overfitting (see the configuration sketch below).
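+
+As a minimal sketch of the 4-bit loading mentioned above (assuming `bitsandbytes` is installed; the exact quantization settings are illustrative, not the recorded training configuration):
+
+```python
+import torch
+from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+
+# Illustrative 4-bit (NF4) quantization config in the QLoRA style.
+bnb_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype=torch.float16,
+)
+
+# Load the model with its weights quantized to 4-bit at load time.
+model = AutoModelForCausalLM.from_pretrained(
+    "devshaheen/llama-2-7b-chat-finetune",
+    quantization_config=bnb_config,
+    device_map="auto",  # place layers on available devices automatically
+)
+```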
+
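+For reference, the hyperparameters recorded in the earlier revision of this card map onto `transformers.TrainingArguments` roughly as follows (a sketch of an equivalent configuration, not the author's actual training script; `output_dir` is an assumed placeholder):
+
+```python
+from transformers import TrainingArguments
+
+# Values taken from the removed `training_args` frontmatter above.
+training_args = TrainingArguments(
+    output_dir="./results",  # assumption: any local path works
+    num_train_epochs=1,
+    per_device_train_batch_size=4,
+    learning_rate=2e-4,
+    weight_decay=0.001,
+    gradient_accumulation_steps=1,
+    gradient_checkpointing=True,
+    max_grad_norm=0.3,
+    logging_steps=25,
+    warmup_ratio=0.03,
+    optim="paged_adamw_32bit",
+    lr_scheduler_type="cosine",
+)
+```
+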
+## Usage
+
+You can use this fine-tuned model with the Hugging Face `transformers` library. Below is an example of how to load and use the model for text generation.
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+# Load the fine-tuned model and tokenizer from the Hub
+tokenizer = AutoTokenizer.from_pretrained("devshaheen/llama-2-7b-chat-finetune")
+model = AutoModelForCausalLM.from_pretrained("devshaheen/llama-2-7b-chat-finetune")
+
+# Example text generation
+input_text = "What is the capital of France?"
+inputs = tokenizer(input_text, return_tensors="pt")
+outputs = model.generate(**inputs, max_new_tokens=100)  # cap the response length
+generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+print(generated_text)
+```
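+
+Since the base model follows the Llama-2 chat format, wrapping the instruction in `[INST]` tags may yield better responses (a usage note based on the standard Llama-2 prompt template, not something this card specifies):
+
+```python
+# Reuses the tokenizer and model loaded above.
+prompt = "<s>[INST] What is the capital of France? [/INST]"
+inputs = tokenizer(prompt, return_tensors="pt")
+outputs = model.generate(**inputs, max_new_tokens=100)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```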
+
+## Citation
+
+```bibtex
+@misc{llama-2-7b-chat-finetune,
+  author = {Shaheen Nabi},
+  title = {Fine-tuned Llama-2-7B-Chat Model},
+  year = {2024},
+  publisher = {Hugging Face},
+  howpublished = {\url{https://huggingface.co/devshaheen/llama-2-7b-chat-finetune}},
+}
+```
|