devshaheen committed on
Commit dc5a60a
1 Parent(s): 4a9d5fd

update readme

Files changed (1):
  1. README.md +49 -25
README.md CHANGED
@@ -11,24 +11,6 @@ library_name: transformers
 finetuned_model: true
 model_type: causal-lm
 finetuned_task: instruction-following
-training_args:
-  num_train_epochs: 1
-  batch_size: 4
-  learning_rate: 2e-4
-  weight_decay: 0.001
-  gradient_accumulation_steps: 1
-  gradient_checkpointing: true
-  max_grad_norm: 0.3
-  logging_steps: 25
-  warmup_ratio: 0.03
-  optim: paged_adamw_32bit
-  lr_scheduler_type: cosine
-metrics:
-- accuracy
-- loss
-description: |
-  This is a fine-tuned version of the Llama-2-7B-Chat model, trained on the `mlabonne/guanaco-llama2-1k` dataset for instruction-following tasks. The model has been adapted for text generation, including various NLP tasks such as question answering, summarization, and more.
-  The fine-tuning process utilizes QLoRA and 4-bit quantization for memory efficiency and better GPU utilization. This model has been optimized for instruction-following and efficient training with gradient accumulation and checkpointing.
 tags:
 - instruction-following
 - text-generation
@@ -39,11 +21,53 @@ tags:
 - 4-bit-quantization
 - low-memory
 - training-optimized
-trainer_info: |
-  The model was fine-tuned for 1 epoch with a per-device batch size of 4, a learning rate of 2e-4, the Paged AdamW optimizer, and a warmup ratio of 0.03. Gradient accumulation, weight decay, and gradient checkpointing were used to reduce memory usage, and the base model was quantized to 4-bit NF4 for further memory efficiency.
-  The training script uses the `SFTTrainer` class for supervised fine-tuning, with parameters optimized for instruction-following tasks, ensuring robust performance across various text generation tasks.
-model_compatibility:
-  - GPU: Yes (with support for 4-bit and bf16; check compatibility with your hardware)
-  - Suitable for tasks: Text generation, question answering, summarization, and instruction-based tasks
-
+metrics:
+- accuracy
+- loss
 ---
+
+# Llama-2-7B-Chat Fine-Tuned Model
+
+This model is a fine-tuned version of the **Llama-2-7B-Chat** model, optimized for instruction-following tasks. It was trained on the `mlabonne/guanaco-llama2-1k` dataset and supports efficient text generation across a range of NLP tasks, including question answering, summarization, and text completion.
+
+## Model Details
+- **Base Model**: NousResearch/Llama-2-7b-chat-hf
+- **Fine-Tuning Task**: Instruction-following
+- **Training Dataset**: mlabonne/guanaco-llama2-1k
+- **Optimized For**: Text generation, question answering, summarization, and more
+- **Fine-Tuned Parameters** (see the training sketch below):
+  - **LoRA** (Low-Rank Adaptation) applied for efficient training with small parameter updates
+  - Quantized to **4-bit** (NF4) for memory efficiency and better GPU utilization
+  - **Gradient accumulation** and **gradient checkpointing** to reduce memory usage, plus **weight decay** to prevent overfitting
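+
+For reference, the setup described above corresponds roughly to the following training sketch. The LoRA rank, alpha, and dropout for this run are not recorded in this card, so those values are illustrative assumptions; the `trl`/`peft` API shown matches the 0.7-era releases and may differ in newer versions.
+
+```python
+import torch
+from datasets import load_dataset
+from peft import LoraConfig
+from transformers import (
+    AutoModelForCausalLM,
+    AutoTokenizer,
+    BitsAndBytesConfig,
+    TrainingArguments,
+)
+from trl import SFTTrainer
+
+# Load the base model with 4-bit NF4 quantization (QLoRA-style)
+bnb_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype=torch.float16,
+)
+model = AutoModelForCausalLM.from_pretrained(
+    "NousResearch/Llama-2-7b-chat-hf",
+    quantization_config=bnb_config,
+    device_map="auto",
+)
+model.config.use_cache = False  # incompatible with gradient checkpointing
+
+tokenizer = AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-chat-hf")
+tokenizer.pad_token = tokenizer.eos_token
+
+# LoRA adapter config; r / lora_alpha / lora_dropout are assumed values
+peft_config = LoraConfig(
+    r=64,
+    lora_alpha=16,
+    lora_dropout=0.1,
+    bias="none",
+    task_type="CAUSAL_LM",
+)
+
+# Hyperparameters as documented for this fine-tuning run
+training_args = TrainingArguments(
+    output_dir="./results",
+    num_train_epochs=1,
+    per_device_train_batch_size=4,
+    gradient_accumulation_steps=1,
+    gradient_checkpointing=True,
+    learning_rate=2e-4,
+    weight_decay=0.001,
+    max_grad_norm=0.3,
+    warmup_ratio=0.03,
+    logging_steps=25,
+    optim="paged_adamw_32bit",
+    lr_scheduler_type="cosine",
+)
+
+trainer = SFTTrainer(
+    model=model,
+    train_dataset=load_dataset("mlabonne/guanaco-llama2-1k", split="train"),
+    peft_config=peft_config,
+    dataset_text_field="text",
+    tokenizer=tokenizer,
+    args=training_args,
+)
+trainer.train()
+```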
+
+## Usage
+
+You can use this fine-tuned model with the Hugging Face `transformers` library. Below is an example of how to load and use the model for text generation.
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+# Load the fine-tuned model and tokenizer
+tokenizer = AutoTokenizer.from_pretrained("YOUR_HUGGINGFACE_USERNAME/llama-2-7b-chat-finetune")
+model = AutoModelForCausalLM.from_pretrained("YOUR_HUGGINGFACE_USERNAME/llama-2-7b-chat-finetune")
+
+# Example text generation
+input_text = "What is the capital of France?"
+inputs = tokenizer(input_text, return_tensors="pt")
+outputs = model.generate(**inputs, max_new_tokens=50)
+generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+print(generated_text)
+```
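+
+Because the model was trained with 4-bit quantization, it can also be loaded in 4-bit at inference time to save GPU memory. This is a minimal sketch, assuming `bitsandbytes` and `accelerate` are installed:
+
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+
+# NF4 4-bit quantization, mirroring the training-time setup
+bnb_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype=torch.bfloat16,
+)
+
+model = AutoModelForCausalLM.from_pretrained(
+    "YOUR_HUGGINGFACE_USERNAME/llama-2-7b-chat-finetune",
+    quantization_config=bnb_config,
+    device_map="auto",
+)
+tokenizer = AutoTokenizer.from_pretrained("YOUR_HUGGINGFACE_USERNAME/llama-2-7b-chat-finetune")
+```
+
+Since the training data follows the Llama-2 chat template, wrapping prompts as `[INST] your question [/INST]` may produce better responses.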
+
+## Citation
+
+```bibtex
+@misc{llama-2-7b-chat-finetune,
+  author = {Shaheen Nabi},
+  title = {Fine-tuned Llama-2-7B-Chat Model},
+  year = {2024},
+  publisher = {Hugging Face},
+  howpublished = {\url{https://huggingface.co/devshaheen/llama-2-7b-chat-finetune}},
+}
+```