Model Card for VAGOsolutions-Llama-3-SauerkrautLM-8b-Instruct-openassistant-guanaco

Model Details

Model Description

This model is a fine-tuned version of VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct, optimized for causal language modeling (CAUSAL_LM) using LoRA (Low-Rank Adaptation). Fine-tuning was carried out on Intel Habana Gaudi AI processors, using the optimum-habana library for hardware acceleration.

  • Developed by: AHAMED-27
  • Funded by: [More Information Needed]
  • Shared by: AHAMED-27
  • Model type: Causal Language Model (CAUSAL_LM)
  • Language(s): English
  • License: [More Information Needed]
  • Finetuned from model: VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct

Model Sources

  • Repository: [More Information Needed]

Uses

Direct Use

This model is designed for natural language generation tasks, such as:

  • Text completion
  • Conversational AI
  • Story generation
  • Summarization

Downstream Use

The model can be fine-tuned further for specific NLP applications such as:

  • Chatbots
  • Code generation
  • Sentiment analysis
  • Question answering

Out-of-Scope Use

  • The model is not intended for safety-critical or real-time decision-making applications where factual accuracy is essential.
  • Do not use it to generate misinformation or harmful content.

Bias, Risks, and Limitations

Known Risks

  • The model may generate biased or factually incorrect responses, since it was fine-tuned on publicly available data that can itself contain bias and errors.
  • It may not perform well on low-resource languages or domain-specific tasks without additional fine-tuning.

Recommendations

  • Users should verify the generated content before deploying it in production.
  • Ethical considerations should be taken into account while using this model.

How to Get Started with the Model

Use the code below to load the model and generate text. Note that this repository hosts a LoRA adapter; if loading the weights directly fails, attach the adapter to the base model (VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct) with the peft library instead.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AHAMED-27/VAGOsolutions-Llama-3-SauerkrautLM-8b-Instruct-openassistant-guanaco"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

input_text = "Explain the benefits of using LoRA for fine-tuning large language models."
inputs = tokenizer(input_text, return_tensors="pt")
# Without max_new_tokens, generate() stops after roughly 20 new tokens by default.
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Training Details

Training Data

The model was fine-tuned on the openassistant-guanaco dataset.
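The openassistant-guanaco dataset (commonly the public timdettmers/openassistant-guanaco subset) stores conversations as plain text with "### Human:" / "### Assistant:" turn markers. A minimal sketch of a matching prompt formatter follows; the exact whitespace around the markers is an assumption, so verify it against your copy of the dataset:

```python
def format_guanaco_prompt(user_message: str) -> str:
    """Wrap a user message in openassistant-guanaco turn markers.

    The '### Human:' / '### Assistant:' markers follow the convention used
    by the public dataset; the exact spacing is assumed, not verified here.
    """
    return f"### Human: {user_message} ### Assistant:"

prompt = format_guanaco_prompt("What is LoRA?")
print(prompt)
```

Prompts formatted this way should line up with what the adapter saw during fine-tuning, which generally improves response quality over raw, unformatted input.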

Training Procedure

Preprocessing

  • Tokenization was performed using the AutoTokenizer from the transformers library.
  • LoRA adaptation was applied to the attention projection layers (q_proj, v_proj).

Training Hyperparameters

  • Training Regime: BF16 Mixed Precision
  • Epochs: 3
  • Batch Size: 16 per device
  • Learning Rate: 1e-4
  • Optimizer: Adam
  • Scheduler: Constant LR
  • LoRA Rank (r): 8
  • LoRA Alpha: 16
  • LoRA Dropout: 0.05
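The LoRA settings above map directly onto a peft LoraConfig. The sketch below assumes the peft library was used for adaptation (the card does not name the exact training script); the values and target modules are taken from the hyperparameters and Preprocessing sections above:

```python
from peft import LoraConfig

# Values taken from the listed hyperparameters; target_modules are the
# attention projection layers named in the Preprocessing section.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
```

Targeting only q_proj and v_proj keeps the number of trainable parameters small while still adapting the attention pattern, which is the usual trade-off behind the short runtime reported below.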

Speeds, Sizes, Times

  • Training Runtime: 1086.32 seconds
  • Training Samples per Second: 16.197
  • Training Steps per Second: 1.015
  • Total Available Memory: 94.62 GB
  • Max Memory Allocated: 92.66 GB
  • Memory Currently Allocated: 67.67 GB
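As a sanity check, the throughput figures above are internally consistent: samples per second times runtime gives the number of examples processed across all 3 epochs, and steps per second times runtime times the per-device batch size lands close to the same number (up to a final partial batch). A small sketch:

```python
runtime_s = 1086.32       # reported training runtime
samples_per_s = 16.197    # reported training samples per second
steps_per_s = 1.015       # reported training steps per second
batch_size = 16           # reported per-device batch size

samples_processed = samples_per_s * runtime_s  # examples seen over 3 epochs
steps_taken = steps_per_s * runtime_s          # optimizer steps taken
# steps × batch size ≈ samples processed, confirming the figures agree
print(round(samples_processed), round(steps_taken))
```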

Evaluation

Testing Data, Factors & Metrics

Testing Data

  • The model was evaluated on a held-out validation set from the openassistant-guanaco dataset.

Evaluation Metrics

  • Evaluation Accuracy: 70.18%
  • Evaluation Loss: 1.4535
  • Perplexity: 4.28
  • Evaluation Runtime: 9.54 seconds
  • Evaluation Samples per Second: 35.85
  • Evaluation Steps per Second: 4.513
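For a causal language model evaluated with cross-entropy, perplexity is the exponential of the evaluation loss, and the two reported figures are mutually consistent:

```python
import math

eval_loss = 1.4535  # reported evaluation loss (cross-entropy, in nats)

# Perplexity = exp(loss) for a cross-entropy-evaluated causal LM.
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # 4.28, matching the reported value
```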

Software Dependencies

  • Transformers Version: 4.38.2
  • Optimum-Habana Version: 1.24.0
  • Intel Gaudi SynapseAI Toolkit

Acknowledgments

This fine-tuning process was completed using Intel Gaudi hardware, enabling optimized performance with reduced training time. Special thanks to the Intel Habana team for their work on Gaudi AI processors.

For more details, visit Habana Labs.
