Qwen2.5-3B-Instruct with LoRA Adapter
This model is a fine-tuned version of unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit, trained with Parameter-Efficient Fine-Tuning (PEFT) using the LoRA method.
Model Details
Model Description
This model applies LoRA (Low-Rank Adaptation) to the Qwen2.5-3B-Instruct base model, targeting the attention and MLP projection modules for efficient fine-tuning. The base model is quantized to 4-bit precision via the Unsloth bitsandbytes (bnb-4bit) build, which reduces memory requirements for fine-tuning and inference on lower-resource hardware.
- Developed by: montebello.ai
- Funded by: Bootstrapped 4 Life
- Model type: Causal Language Model with LoRA Adapter
- Language(s) (NLP): English (primary), with partial multilingual support inherited from the base model (evaluation ongoing).
- License: MIT
- Finetuned from model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit
Model Sources
- Demo: [COMING SOON]
Uses
Direct Use
This model is designed for natural language understanding and generation tasks, including:
- Conversational AI
- Summarization
- Text completion
- Question answering
Downstream Use
Further fine-tuning for specific NLP tasks, such as domain-specific conversational agents.
Out-of-Scope Use
- Real-time critical systems requiring guaranteed safety and accuracy.
- Generating content for sensitive domains without human oversight.
Bias, Risks, and Limitations
- The base model may contain biases present in the original training data.
- The model is not fine-tuned for safety-critical applications.
- Limitations include possible hallucinations in generative outputs and language biases.
Recommendations
- Perform task-specific evaluations before deploying the model.
- Include human-in-the-loop for critical applications.
How to Get Started with the Model
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the 4-bit base model and tokenizer (requires bitsandbytes and accelerate)
tokenizer = AutoTokenizer.from_pretrained("unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit")
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit",
    device_map="auto",
)

# Attach the LoRA adapter
model = PeftModel.from_pretrained(base_model, "path_to_your_adapter")

# Example inference
input_text = "Your input text here."
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
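Because Qwen2.5-3B-Instruct is a chat-tuned model, conversational prompts are best formatted with the tokenizer's chat template rather than passed as raw text. A minimal sketch, continuing from the snippet above (the example messages are placeholders, not taken from the training data):

# Format a conversation with the model's chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain LoRA fine-tuning in one sentence."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))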
Training Details
Training Procedure
- Preprocessing: Tokenization with AutoTokenizer.
- Training regime: bf16 mixed precision with LoRA fine-tuning.
- LoRA Configuration (see the sketch after this list):
  - r: 64
  - lora_alpha: 64
  - target_modules: ['v_proj', 'o_proj', 'up_proj', 'gate_proj', 'down_proj', 'q_proj', 'k_proj']
  - bias: none
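The values above map directly onto a PEFT LoraConfig. A minimal sketch of how an adapter with this configuration can be attached to the base model (the loading code and task_type are illustrative assumptions, not the exact training script):

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit", device_map="auto"
)

# LoRA hyperparameters taken from the list above
lora_config = LoraConfig(
    r=64,
    lora_alpha=64,
    target_modules=["v_proj", "o_proj", "up_proj", "gate_proj",
                    "down_proj", "q_proj", "k_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the LoRA matrices are trainable

Training then proceeds with a standard Hugging Face trainer under the bf16 mixed-precision regime noted above (e.g. bf16=True in the training arguments).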
Training Data
The LoRA adapter was fine-tuned on the GSM8K dataset from OpenAI.
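GSM8K is a dataset of grade-school math word problems with question/answer pairs. A minimal sketch of loading it with the datasets library (the exact prompt formatting used during training is not documented here, so the mapping below is illustrative):

from datasets import load_dataset

# GSM8K provides "question" and "answer" columns
dataset = load_dataset("openai/gsm8k", "main", split="train")

def to_messages(example):
    # Illustrative chat-style formatting; not necessarily the training format
    return {
        "messages": [
            {"role": "user", "content": example["question"]},
            {"role": "assistant", "content": example["answer"]},
        ]
    }

dataset = dataset.map(to_messages)
print(dataset[0]["question"][:80])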
Environmental Impact
- Hardware Type: NVIDIA T4 GPU
- Hours used: 2.5
Technical Specifications
Model Architecture and Objective
- Transformer-based causal language model using LoRA for efficient adaptation.
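For reference, LoRA keeps the pretrained weight matrix of each targeted projection frozen and learns a low-rank update, so the adapted weight is

$$W' = W_0 + \frac{\alpha}{r}\,BA, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},$$

with r = 64 and lora_alpha = 64 here, so the scaling factor α/r equals 1. Only A and B are trained; W_0 and the 4-bit base weights stay frozen.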
Compute Infrastructure
- Hardware: NVIDIA T4 GPU (see Environmental Impact above)
- Software:
- Python 3.x
- Transformers library
- PEFT 0.14.0
Citation
BibTeX:
@misc{hamilton2025qwen25lora,
  title={Qwen2.5-3B-Instruct with LoRA Adapter},
  author={Kenneth Hamilton},
  year={2025}
}
APA:
Hamilton, K. (2025). Qwen2.5-3B-Instruct with LoRA Adapter. Retrieved from [repository link].
Model Card Contact
- Contact: [Your contact information]
Framework versions
- PEFT 0.14.0