---
datasets:
- blesspearl/stackexchange-math-sample
language:
- en
library_name: transformers
license: mit
---
|
|
|
# Fine-Tuned LLaMA 3.1 Model on Stack Exchange Math Dataset

This repository contains a LLaMA 3.1 model fine-tuned with LoRA on a sample of questions from Mathematics Stack Exchange. The model is designed to answer mathematical questions in the style of accepted Stack Exchange answers. The fine-tuning procedure follows this [guide](https://medium.com/@rajatsharma_33357/fine-tuning-llama-using-lora-fb3f48a557d5).
|
|
|
## Model Details |
|
|
|
- **Base Model:** [Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B) |
|
- **Fine-Tuned Model:** [math-stackexchange](https://huggingface.co/blesspearl/math-stackexchange) |
|
- **Dataset:** [stackexchange-math-sample](https://huggingface.co/datasets/blesspearl/stackexchange-math-sample) |
|
- **Training Environment:** |
|
- Framework: PyTorch with Transformers |
|
- Platform: Google Colab |
|
- Hardware: 1 x T4 GPU (15GB) |
|
|
|
## Data Preparation |
|
|
|
The fine-tuning dataset consists of 1,000 samples collected from Mathematics Stack Exchange, each pairing a question with its accepted answer.
|
|
|
### Preprocessing |
|
|
|
The data was preprocessed using the following steps: |
|
1. Loading the dataset from Hugging Face. |
|
2. Shuffling the dataset and selecting 1000 samples. |
|
3. Formatting the data into a chat template suitable for training (an example record is shown below).
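For illustration, after step 3 each record gains a `text` field in the ChatML format that `setup_chat_format` installs. The question and answer here are made up, not a real dataset record:

```
<|im_start|>user
What is the derivative of sin(x)?<|im_end|>
<|im_start|>assistant
The derivative of sin(x) is cos(x).<|im_end|>
```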
|
|
|
## Training Details |
|
|
|
### Libraries and Dependencies |
|
|
|
```python
import os
import torch
import wandb
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
    pipeline,
    logging,
)
from peft import LoraConfig, PeftModel, prepare_model_for_kbit_training, get_peft_model
from trl import SFTTrainer, setup_chat_format
from google.colab import drive, userdata
from huggingface_hub import login
```
|
|
|
### Loading the Model
|
|
|
```python
model_name = "meta-llama/Meta-Llama-3.1-8B"
dataset_name = "blesspearl/stackexchange-math-sample"

torch_dtype = torch.float16
attn_implementation = "eager"

# Log the training run to Weights & Biases
wandb.login(key=userdata.get("WANDB_API_KEY"))
run = wandb.init(
    project="Fine tuning LLama-3.1-8b on math-stack-exchange",
    job_type="training",
    anonymous="allow",
)

# 4-bit NF4 quantization so the 8B model fits on a single T4
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch_dtype,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    attn_implementation=attn_implementation,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)

# Install a chat template (ChatML) and the special tokens it requires
model, tokenizer = setup_chat_format(model, tokenizer)
```
|
|
|
### LoRA Configuration |
|
|
|
```python
# Apply LoRA to all attention and MLP projection matrices
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["up_proj", "down_proj", "gate_proj", "k_proj", "q_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, peft_config)
```
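As a quick sanity check (not shown in the original notebook), peft can report how many parameters the adapters actually train:

```python
# Prints the trainable parameter count and percentage; with r=16 on these
# projection matrices this is a small fraction of the 8B total
model.print_trainable_parameters()
```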
|
|
|
### Data Preparation |
|
|
|
```python
dataset = load_dataset(dataset_name, split="all")
dataset = dataset.shuffle(seed=65).select(range(1000))

def format_chat_template(row):
    # Pair each question with its accepted answer in chat format
    row_json = [
        {"role": "user", "content": row["question_body"]},
        {"role": "assistant", "content": row["accepted_answer"]},
    ]
    row["text"] = tokenizer.apply_chat_template(row_json, tokenize=False)
    return row

dataset = dataset.map(format_chat_template, num_proc=4)
dataset = dataset.train_test_split(test_size=0.2)
dataset = dataset.remove_columns(["question_body", "accepted_answer"])
```
|
|
|
### Training Configuration |
|
|
|
```python
training_arguments = TrainingArguments(
    output_dir="math-stackexchange",
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,   # effective batch size of 2
    optim="paged_adamw_32bit",
    num_train_epochs=1,
    evaluation_strategy="steps",
    eval_steps=0.2,                  # evaluate every 20% of training steps
    logging_steps=1,
    warmup_steps=10,
    logging_strategy="steps",
    learning_rate=2e-4,
    fp16=False,
    bf16=False,
    group_by_length=True,
    report_to="wandb",
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    peft_config=peft_config,
    max_seq_length=512,
    dataset_text_field="text",
    tokenizer=tokenizer,
    args=training_arguments,
    packing=False,
)

trainer.train()
wandb.finish()

# Re-enable the KV cache (disabled during training) for faster generation
model.config.use_cache = True
```
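The original notebook does not show how the result is persisted. As a minimal sketch, the adapter and tokenizer can be saved locally and optionally pushed to the Hub, assuming `login()` was called with a token that has write access to `blesspearl/math-stackexchange`:

```python
# Save the LoRA adapter and tokenizer locally
trainer.model.save_pretrained("math-stackexchange")
tokenizer.save_pretrained("math-stackexchange")

# Optionally upload to the Hugging Face Hub
trainer.model.push_to_hub("blesspearl/math-stackexchange")
tokenizer.push_to_hub("blesspearl/math-stackexchange")
```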
|
|
|
|
|
|
## Usage |
|
|
|
To run inference, load the model with the Hugging Face Transformers library and format your question with the same chat template used during training.
|
|
|
### Example Code |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "blesspearl/math-stackexchange"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def answer_question(question):
    # Use the same chat template the model was fine-tuned on
    messages = [{"role": "user", "content": question}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

question = "What is the derivative of sin(x)?"
print(answer_question(question))
```
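If `blesspearl/math-stackexchange` hosts only the LoRA adapter rather than fully merged weights, `AutoModelForCausalLM` alone may not suffice. A sketch of attaching the adapter to the (gated) base model with peft, under that assumption:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("blesspearl/math-stackexchange")
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B", device_map="auto"
)
# setup_chat_format added special tokens during training, so the base model's
# embedding matrix must match the tokenizer before attaching the adapter
base.resize_token_embeddings(len(tokenizer))
model = PeftModel.from_pretrained(base, "blesspearl/math-stackexchange")
```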
|
|
|
## Conclusion |
|
|
|
This documentation provides an overview of fine-tuning LLaMA 3.1 with LoRA on the Stack Exchange Math dataset. The model and dataset are available on Hugging Face for further use and exploration.
|
|
|
For any questions or issues, feel free to open an issue on the [model repository](https://huggingface.co/blesspearl/math-stackexchange). |
|
|