Uploaded model: Llama 3.1 8B Finetuned

  • Developed by: kparkhade
  • License: apache-2.0
  • Base model: unsloth/Meta-Llama-3.1-8B-bnb-4bit

Overview

This fine-tuned Llama 3.1 8B model is optimized for efficient text generation. Training used optimization techniques from Unsloth together with Hugging Face's TRL library, completing about 2x faster than conventional fine-tuning.
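
For reference, the outline below is a minimal sketch of the kind of Unsloth + TRL fine-tuning run described above. The dataset (imdb, chosen only because it exposes a plain "text" column), the LoRA hyperparameters, and the training arguments are illustrative assumptions, not the configuration actually used for this model.

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the 4-bit base model through Unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small fraction of weights is trained
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)

# Supervised fine-tuning with TRL (placeholder dataset with a "text" column)
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=load_dataset("imdb", split="train"),
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=60,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()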

Key Features

  • Speed Optimized: Training was accelerated with the Unsloth framework, significantly reducing resource consumption.
  • Model Compatibility: Compatible with Hugging Face's ecosystem for seamless integration.
  • Quantization: Built on a 4-bit quantized base model for efficient deployment and inference (see the loading sketch after this list).
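
As a concrete illustration of the quantization point, the snippet below loads the 4-bit base checkpoint with a bitsandbytes config through transformers. The NF4 settings shown are common defaults and are assumptions here, not this model's recorded configuration.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization config (illustrative settings)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Meta-Llama-3.1-8B-bnb-4bit",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("unsloth/Meta-Llama-3.1-8B-bnb-4bit")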

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer (device_map="auto" places weights on GPU if available)
model_name = "kparkhade/Llama-3.1-8B"
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Generate text (max_new_tokens avoids the short default generation length)
inputs = tokenizer("Your input prompt here", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
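
For interactive use, output can be streamed token by token with transformers' TextStreamer. This builds on the model and tokenizer loaded above; the prompt is only an example.

from transformers import TextStreamer

# Print tokens to stdout as they are generated, skipping the echoed prompt
streamer = TextStreamer(tokenizer, skip_prompt=True)
inputs = tokenizer("Write a short poem about the sea.", return_tensors="pt").to(model.device)
model.generate(**inputs, streamer=streamer, max_new_tokens=100)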

Applications

This model can be used for the tasks below; a quick-start example follows the list:

  • Creative writing (e.g., story or poetry generation)
  • Generating conversational responses
  • Assisting with coding-related queries
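
For quick experiments with any of these use cases, the high-level pipeline API is a convenient shortcut; the creative-writing prompt below is only an example.

from transformers import pipeline

# The text-generation pipeline wraps tokenization, generation, and decoding
generator = pipeline("text-generation", model="kparkhade/Llama-3.1-8B", device_map="auto")
result = generator("Once upon a time in a quiet harbor town,", max_new_tokens=80)
print(result[0]["generated_text"])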

Acknowledgements

Special thanks to the Unsloth team for providing tools that make model fine-tuning faster and more efficient.
