|
--- |
|
base_model: unsloth/llama-3.2-3b-instruct-bnb-4bit |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- llama |
|
- trl |
|
- multilingual |
|
- instruction-tuning |
|
license: apache-2.0 |
|
language: |
|
- en |
|
- kn |
|
datasets: |
|
- charanhu/kannada-instruct-dataset-390k |
|
library_name: transformers |
|
--- |
|
|
|
# Uploaded Model: devshaheen/llama-3.2-3b-Instruct-finetune |
|
|
|
## Overview |
|
|
|
- **Developed by:** devshaheen |
|
- **License:** Apache-2.0 |
|
- **Finetuned from model:** `unsloth/llama-3.2-3b-instruct-bnb-4bit` |
|
- **Languages Supported:** |
|
- **English** (`en`) for general-purpose text generation and instruction-following tasks. |
|
- **Kannada** (`kn`) with a focus on localized and culturally aware text generation. |
|
- **Dataset Used:** [charanhu/kannada-instruct-dataset-390k](https://huggingface.co/datasets/charanhu/kannada-instruct-dataset-390k) |
|
|
|
This model is a fine-tuned version of Llama 3.2 3B Instruct, optimized for multilingual instruction-following with a particular emphasis on English and Kannada. It builds on a 4-bit quantized base, enabling deployment in low-resource environments with minimal impact on output quality.
|
|
|
--- |
|
|
|
## Features |
|
|
|
### 1. **Instruction Tuning** |
|
The model is trained to follow a wide range of instructions and generate contextually relevant responses, covering both creative and factual text generation tasks.
|
|
|
### 2. **Multilingual Support** |
|
The model generates text in both English and Kannada, making it suitable for applications that require bilingual capabilities.
|
|
|
### 3. **Optimized Training** |
|
Training was accelerated using [Unsloth](https://github.com/unslothai/unsloth), which reports roughly **2x faster training** than a standard setup, combined with Hugging Face's TRL (Transformer Reinforcement Learning) library for supervised fine-tuning.
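
The exact training script is not part of this card; the sketch below shows a typical Unsloth + TRL supervised fine-tuning setup for this base model. The hyperparameters, the LoRA configuration, and the assumption that the dataset exposes a preformatted `text` column are illustrative, not confirmed values for this checkpoint.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the 4-bit base model through Unsloth's fast loader
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3.2-3b-instruct-bnb-4bit",
    max_seq_length=2048,  # illustrative; choose to fit your data
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small fraction of the weights is trained
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("charanhu/kannada-instruct-dataset-390k", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumes a preformatted "text" column
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        optim="adamw_8bit",  # memory-efficient AdamW variant
        output_dir="outputs",
    ),
)
trainer.train()
```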
|
|
|
### 4. **Efficiency through Quantization** |
|
Built on the `bnb-4bit` quantized base model, it is designed for environments with limited computational resources while keeping output quality close to that of the full-precision model.
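
As one example of how a 4-bit setup like this can be loaded with plain Transformers, the sketch below uses `BitsAndBytesConfig` (requires the `bitsandbytes` package); the NF4 quantization type and bfloat16 compute dtype are common defaults here, not settings confirmed for this checkpoint.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization config (requires bitsandbytes and a CUDA GPU)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "devshaheen/llama-3.2-3b-Instruct-finetune",
    quantization_config=bnb_config,
    device_map="auto",  # place layers on available devices automatically
)
tokenizer = AutoTokenizer.from_pretrained("devshaheen/llama-3.2-3b-Instruct-finetune")
```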
|
|
|
--- |
|
|
|
## Usage Scenarios |
|
|
|
### General Use |
|
- Text completion and creative writing. |
|
- Following instructions and answering queries in English and Kannada.
|
|
|
### Specialized Applications |
|
- Localized AI systems in Kannada for chatbots, educational tools, and more. |
|
- Research and development on multilingual instruction-tuned models. |
|
|
|
--- |
|
|
|
## Training Details
|
|
|
### Training Dataset
|
The model was fine-tuned on [charanhu/kannada-instruct-dataset-390k](https://huggingface.co/datasets/charanhu/kannada-instruct-dataset-390k), a dataset of roughly 390k Kannada instruction examples.
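
The dataset can be pulled straight from the Hub for inspection; the `train` split name below is an assumption about the dataset's layout.

```python
from datasets import load_dataset

# Download the instruction dataset from the Hugging Face Hub
dataset = load_dataset("charanhu/kannada-instruct-dataset-390k")
print(dataset)              # list available splits and columns
print(dataset["train"][0])  # inspect a single instruction example
```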
|
|
|
### Training Parameters
|
- **Base Model:** Llama 3.2 3B Instruct (`unsloth/llama-3.2-3b-instruct-bnb-4bit`)
|
- **Optimizer:** AdamW |
|
- **Quantization:** 4-bit (bnb-4bit) |
|
- **Framework:** Hugging Face Transformers + Unsloth
|
|
|
--- |
|
|
|
## Example Usage |
|
|
|
### Python Code |
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and tokenizer from the Hugging Face Hub
model_name = "devshaheen/llama-3.2-3b-Instruct-finetune"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Generate text from a plain prompt
input_text = "How does climate change affect the monsoon in Karnataka?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=150)  # cap new tokens rather than total length
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
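
### Chat-Formatted Prompts

Because this is an instruction-tuned checkpoint, prompts formatted through the tokenizer's chat template generally produce better-aligned responses. The sketch below reuses the `model` and `tokenizer` loaded above; the Kannada prompt translates to "Tell me about Karnataka's history," and `max_new_tokens=200` is an illustrative setting.

```python
# Instruction-style prompting via the tokenizer's chat template
messages = [
    # Kannada for "Tell me about Karnataka's history"
    {"role": "user", "content": "ಕರ್ನಾಟಕದ ಇತಿಹಾಸದ ಬಗ್ಗೆ ಹೇಳಿ"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant header so generation starts cleanly
    return_tensors="pt",
)
outputs = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```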