Model Card for Qwen-Qwen2.5-7B-Instruct-open-assistant-guanaco-2
Model Details
Model Description
This model is a fine-tuned version of Qwen2.5-7B, optimized for causal language modeling (CAUSAL_LM) using LoRA (Low-Rank Adaptation). Fine-tuning was performed on Intel Habana Gaudi AI processors, using optimum-habana for hardware acceleration.
- Developed by: AHAMED-27
- Funded by: [More Information Needed]
- Shared by: AHAMED-27
- Model type: Causal Language Model (CAUSAL_LM)
- Language(s): English
- License: [More Information Needed]
- Finetuned from model: Qwen/Qwen2.5-7B
Model Sources
- Repository: AHAMED-27/Qwen-Qwen2.5-7B-Instruct-open-assistant-guanaco-2
- Paper: [More Information Needed]
- Demo: [More Information Needed]
Uses
Direct Use
This model is designed for natural language generation tasks, such as:
- Text completion
- Conversational AI
- Story generation
- Summarization
Downstream Use
The model can be fine-tuned further for specific NLP applications such as:
- Chatbots
- Code generation
- Sentiment analysis
- Question answering
Out-of-Scope Use
- The model is not intended for real-time decision-making applications where accuracy is critical.
- Avoid using it for generating misinformation or harmful content.
Bias, Risks, and Limitations
Known Risks
- The model may generate biased or incorrect responses as it is fine-tuned on publicly available datasets.
- It may not perform well on low-resource languages or domain-specific tasks without additional fine-tuning.
Recommendations
- Users should verify the generated content before deploying it in production.
- Ethical considerations should be taken into account while using this model.
How to Get Started with the Model
Use the code below to load and generate text using the model:
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned tokenizer and model from the Hub
tokenizer = AutoTokenizer.from_pretrained("AHAMED-27/Qwen-Qwen2.5-7B-Instruct-open-assistant-guanaco-2")
model = AutoModelForCausalLM.from_pretrained("AHAMED-27/Qwen-Qwen2.5-7B-Instruct-open-assistant-guanaco-2")

# Tokenize a prompt and generate a continuation
input_text = "Explain the benefits of using LoRA for fine-tuning large language models."
inputs = tokenizer(input_text, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
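If the repository hosts LoRA adapter weights rather than fully merged weights, the adapter can instead be attached to the base model with the peft library. The following is a minimal sketch under that assumption; the repository and base-model IDs are taken from this card.

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model the LoRA adapter was trained from
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")

# Attach the LoRA adapter weights (assumes this repository contains a PEFT adapter)
model = PeftModel.from_pretrained(base_model, "AHAMED-27/Qwen-Qwen2.5-7B-Instruct-open-assistant-guanaco-2")

# Optionally merge the adapter into the base weights for faster inference
model = model.merge_and_unload()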
Training Details
Training Data
The model was fine-tuned on the timdettmers/openassistant-guanaco dataset.
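For reference, the dataset can be pulled from the Hugging Face Hub with the datasets library; the split and column names below follow the public timdettmers/openassistant-guanaco dataset card.

from datasets import load_dataset

# The dataset ships a single "text" column containing "### Human: ... ### Assistant: ..." conversations
dataset = load_dataset("timdettmers/openassistant-guanaco")
print(dataset["train"][0]["text"][:200])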
Training Procedure
Preprocessing
- Tokenization was performed using the AutoTokenizer from the transformers library.
- LoRA adaptation was applied to the attention projection layers (q_proj, v_proj).
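A minimal sketch of the tokenization step above, assuming the dataset's single text column is tokenized with truncation; the maximum sequence length shown is illustrative, not the value used in this run.

from transformers import AutoTokenizer
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")
dataset = load_dataset("timdettmers/openassistant-guanaco")

def tokenize(batch):
    # 512 is an illustrative maximum length, not necessarily the value used for this run
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized_dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])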
Training Hyperparameters
- Training Regime: BF16 Mixed Precision
- Epochs: 3
- Batch Size: 16 per device
- Learning Rate: 1e-4
- Optimizer: Adam
- Scheduler: Constant LR
- LoRA Rank (r): 8
- LoRA Alpha: 16
- LoRA Dropout: 0.05
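The hyperparameters above map onto a PEFT LoraConfig and standard trainer arguments roughly as follows. This is an illustrative sketch, not the exact training script: output_dir is a placeholder, and the actual run used the Gaudi-specific trainer classes from optimum-habana.

from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings reported in this card
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

# Trainer settings reported in this card (output_dir is a placeholder)
training_args = TrainingArguments(
    output_dir="qwen2.5-7b-guanaco-lora",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=1e-4,
    lr_scheduler_type="constant",
    bf16=True,
)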
Speeds, Sizes, Times
- Training Runtime: 1026.98 seconds
- Training Samples per Second: 17.471
- Training Steps per Second: 1.092
- Total Available Memory: 94.62 GB
- Max Memory Allocated: 89.17 GB
- Memory Currently Allocated: 58.34 GB
Evaluation
Testing Data, Factors & Metrics
Testing Data
- The model was evaluated on a held-out validation set from the timdettmers/openassistant-guanaco dataset.
Evaluation Metrics
- Evaluation Accuracy: 71.51%
- Evaluation Loss: 1.3675
- Perplexity: 3.92 (the exponential of the evaluation loss; see the consistency check after this list)
- Evaluation Runtime: 20.308 seconds
- Evaluation Samples per Second: 22.511
- Evaluation Steps per Second: 2.882
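For reference, the reported perplexity is simply the exponential of the evaluation loss:

import math

eval_loss = 1.3675
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # ≈ 3.93, matching the reported ~3.92 up to rounding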
Software Dependencies
- Transformers Version: 4.38.2
- Optimum-Habana Version: 1.24.0
- Intel Gaudi SynapseAI Toolkit
Acknowledgments
This fine-tuning process was completed using Intel Gaudi hardware, enabling optimized performance with reduced training time. Special thanks to the Intel Habana team for their work on Gaudi AI processors.
For more details, visit Habana Labs.