devshaheen committed on
Commit 6ab9caa
1 Parent(s): bafd56b

Create README.md

Files changed (1)
  1. README.md +48 -0
README.md ADDED
---
license: mit
datasets:
- mlabonne/guanaco-llama2-1k
language:
- en
base_model:
- NousResearch/Llama-2-7b-chat-hf
pipeline_tag: text-generation
library_name: transformers
finetuned_model: true
model_type: causal-lm
finetuned_task: instruction-following
training_args:
  num_train_epochs: 1
  batch_size: 4
  learning_rate: 2e-4
  weight_decay: 0.001
  gradient_accumulation_steps: 1
  gradient_checkpointing: true
  max_grad_norm: 0.3
  logging_steps: 25
  warmup_ratio: 0.03
  optim: paged_adamw_32bit
  lr_scheduler_type: cosine
metrics:
- accuracy
- loss
description: |
  This is a fine-tuned version of the Llama-2-7B-Chat model, trained on the `mlabonne/guanaco-llama2-1k` dataset for instruction-following tasks. The model is intended for text generation, including tasks such as question answering and summarization.
  Fine-tuning uses QLoRA (LoRA adapters on a 4-bit-quantized base model) for memory efficiency and better GPU utilization, together with gradient accumulation and gradient checkpointing.
tags:
- instruction-following
- text-generation
- fine-tuned
- llama2
- causal-language-model
- qlora
- 4-bit-quantization
- low-memory
- training-optimized
trainer_info: |
  The model was fine-tuned for 1 epoch with a per-device batch size of 4, a learning rate of 2e-4, the Paged AdamW (32-bit) optimizer, a cosine learning-rate schedule, and a warmup ratio of 0.03. Weight decay, gradient accumulation, and gradient checkpointing were used to limit memory usage, and the base model was loaded with 4-bit NF4 quantization for the same reason.
  Training used the `SFTTrainer` class for supervised fine-tuning, with parameters chosen for instruction-following tasks; a minimal sketch of this setup appears after the metadata block below.
model_compatibility:
- GPU: yes (supports 4-bit loading and bf16; check compatibility with your hardware)
- Suitable tasks: text generation, question answering, summarization, and other instruction-based tasks
---
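
The snippet below is a minimal sketch of the QLoRA fine-tuning setup described in `trainer_info`, not the exact training script. It assumes `transformers`, `datasets`, `bitsandbytes`, `peft`, and `trl` (the pre-1.0 `SFTTrainer` API, ~0.7.x) are installed. The trainer hyperparameters mirror the `training_args` listed above; the LoRA settings (`r`, `lora_alpha`, `lora_dropout`) and the output directory are illustrative assumptions.

```python
# Sketch of QLoRA fine-tuning on mlabonne/guanaco-llama2-1k.
# Assumes trl ~0.7.x (older SFTTrainer signature); LoRA hyperparameters are assumptions.
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig
from trl import SFTTrainer

base_model = "NousResearch/Llama-2-7b-chat-hf"
dataset = load_dataset("mlabonne/guanaco-llama2-1k", split="train")

# 4-bit NF4 quantization of the base model, as stated in trainer_info.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model.config.use_cache = False  # recommended when gradient checkpointing is on

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# LoRA adapter settings below are illustrative, not taken from the card.
peft_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.1, bias="none", task_type="CAUSAL_LM"
)

# These hyperparameters mirror the training_args in the card metadata.
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    optim="paged_adamw_32bit",
    learning_rate=2e-4,
    weight_decay=0.001,
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    lr_scheduler_type="cosine",
    logging_steps=25,
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # guanaco-llama2-1k stores formatted prompts in "text"
    tokenizer=tokenizer,
    args=training_args,
    packing=False,
)
trainer.train()
```

Paged AdamW and NF4 quantization keep optimizer state and weights small enough for a single consumer GPU, which is why the card lists `paged_adamw_32bit` and 4-bit loading together.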
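
For inference, the model can be used with the standard `transformers` text-generation pipeline. The repository id below is a placeholder, and the Llama-2 chat prompt format (`[INST] ... [/INST]`) matches the formatting used by the `guanaco-llama2-1k` dataset.

```python
# Minimal text-generation example with the fine-tuned model.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "devshaheen/<this-model>"  # placeholder: replace with the actual repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = "<s>[INST] What is a large language model? [/INST]"
print(generator(prompt, max_new_tokens=200)[0]["generated_text"])
```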