---
datasets:
- blesspearl/stackexchange-math-sample
language:
- en
library_name: transformers
license: mit
---
|
|
|
# Fine-Tuned LLaMA 3.1 Model on Stack Exchange Math Dataset

This repository contains a LLaMA 3.1 model fine-tuned with LoRA on a sample of questions from Mathematics Stack Exchange. The model is designed to answer mathematical questions in the style of accepted Stack Exchange answers. The fine-tuning procedure follows this [guide](https://medium.com/@rajatsharma_33357/fine-tuning-llama-using-lora-fb3f48a557d5).
|
|
|
## Model Details |
|
|
|
- **Base Model:** [Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B) |
|
- **Fine-Tuned Model:** [math-stackexchange](https://huggingface.co/blesspearl/math-stackexchange) |
|
- **Dataset:** [stackexchange-math-sample](https://huggingface.co/datasets/blesspearl/stackexchange-math-sample) |
|
- **Training Environment:** |
|
- Framework: PyTorch with Transformers |
|
- Platform: Google Colab |
|
- Hardware: 1 x T4 GPU (15GB) |
|
|
|
## Data Preparation |
|
|
|
The fine-tuning dataset consists of 1,000 samples collected from Mathematics Stack Exchange, each pairing a question with its accepted answer.
|
|
|
### Preprocessing |
|
|
|
The data was preprocessed using the following steps: |
|
1. Loading the dataset from Hugging Face. |
|
2. Shuffling the dataset and selecting 1000 samples. |
|
3. Formatting the data into a chat template suitable for training (an example record is shown below).
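For illustration, after step 3 each record gains a `text` field in the ChatML format that `setup_chat_format` installs. The question and answer here are made up, not a real dataset record:

```
<|im_start|>user
What is the derivative of sin(x)?<|im_end|>
<|im_start|>assistant
The derivative of sin(x) is cos(x).<|im_end|>
```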
|
|
|
## Training Details |
|
|
|
### Libraries and Dependencies |
|
|
|
```python
import os
import torch
import wandb
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
    pipeline,
    logging,
)
from peft import LoraConfig, PeftModel, prepare_model_for_kbit_training, get_peft_model
from trl import SFTTrainer, setup_chat_format
from google.colab import drive, userdata
from huggingface_hub import login
```
|
|
|
### Loading the Model
|
|
|
```python
model_name = "meta-llama/Meta-Llama-3.1-8B"
dataset_name = "blesspearl/stackexchange-math-sample"

torch_dtype = torch.float16
attn_implementation = "eager"

# Log the training run to Weights & Biases
wandb.login(key=userdata.get("WANDB_API_KEY"))
run = wandb.init(
    project="Fine tuning LLama-3.1-8b on math-stack-exchange",
    job_type="training",
    anonymous="allow",
)

# 4-bit NF4 quantization so the 8B model fits on a single T4
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch_dtype,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    attn_implementation=attn_implementation,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)

# Install a chat template (ChatML) and the special tokens it requires
model, tokenizer = setup_chat_format(model, tokenizer)
```
|
|
|
### LoRA Configuration |
|
|
|
```python
# Apply LoRA to all attention and MLP projection matrices
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["up_proj", "down_proj", "gate_proj", "k_proj", "q_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, peft_config)
```
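As a quick sanity check (not shown in the original notebook), peft can report how many parameters the adapters actually train:

```python
# Prints the trainable parameter count and percentage; with r=16 on these
# projection matrices this is a small fraction of the 8B total
model.print_trainable_parameters()
```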
|
|
|
### Data Preparation |
|
|
|
```python
dataset = load_dataset(dataset_name, split="all")
dataset = dataset.shuffle(seed=65).select(range(1000))

def format_chat_template(row):
    # Pair each question with its accepted answer in chat format
    row_json = [
        {"role": "user", "content": row["question_body"]},
        {"role": "assistant", "content": row["accepted_answer"]},
    ]
    row["text"] = tokenizer.apply_chat_template(row_json, tokenize=False)
    return row

dataset = dataset.map(format_chat_template, num_proc=4)
dataset = dataset.train_test_split(test_size=0.2)
dataset = dataset.remove_columns(["question_body", "accepted_answer"])
```
|
|
|
### Training Configuration |
|
|
|
```python
training_arguments = TrainingArguments(
    output_dir="math-stackexchange",
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,   # effective batch size of 2
    optim="paged_adamw_32bit",
    num_train_epochs=1,
    evaluation_strategy="steps",
    eval_steps=0.2,                  # evaluate every 20% of training steps
    logging_steps=1,
    warmup_steps=10,
    logging_strategy="steps",
    learning_rate=2e-4,
    fp16=False,
    bf16=False,
    group_by_length=True,
    report_to="wandb",
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    peft_config=peft_config,
    max_seq_length=512,
    dataset_text_field="text",
    tokenizer=tokenizer,
    args=training_arguments,
    packing=False,
)

trainer.train()
wandb.finish()

# Re-enable the KV cache (disabled during training) for faster generation
model.config.use_cache = True
```
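The original notebook does not show how the result is persisted. As a minimal sketch, the adapter and tokenizer can be saved locally and optionally pushed to the Hub, assuming `login()` was called with a token that has write access to `blesspearl/math-stackexchange`:

```python
# Save the LoRA adapter and tokenizer locally
trainer.model.save_pretrained("math-stackexchange")
tokenizer.save_pretrained("math-stackexchange")

# Optionally upload to the Hugging Face Hub
trainer.model.push_to_hub("blesspearl/math-stackexchange")
tokenizer.push_to_hub("blesspearl/math-stackexchange")
```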
|
|
|
|
|
|
## Usage |
|
|
|
To run inference, load the model with the Hugging Face Transformers library and format your question with the same chat template used during training.
|
|
|
### Example Code |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "blesspearl/math-stackexchange"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def answer_question(question):
    # Use the same chat template the model was fine-tuned on
    messages = [{"role": "user", "content": question}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

question = "What is the derivative of sin(x)?"
print(answer_question(question))
```
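If `blesspearl/math-stackexchange` hosts only the LoRA adapter rather than fully merged weights, `AutoModelForCausalLM` alone may not suffice. A sketch of attaching the adapter to the (gated) base model with peft, under that assumption:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("blesspearl/math-stackexchange")
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B", device_map="auto"
)
# setup_chat_format added special tokens during training, so the base model's
# embedding matrix must match the tokenizer before attaching the adapter
base.resize_token_embeddings(len(tokenizer))
model = PeftModel.from_pretrained(base, "blesspearl/math-stackexchange")
```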
|
|
|
## Conclusion |
|
|
|
This documentation provides an overview of fine-tuning LLaMA 3.1 with LoRA on the Stack Exchange Math dataset. The model and dataset are available on Hugging Face for further use and exploration.
|
|
|
For any questions or issues, feel free to open an issue on the [model repository](https://huggingface.co/blesspearl/math-stackexchange). |
|
|