monsterapi/gemma-2b-lora-maths-orca-200k


	---
	library_name: peft
	tags:
	- math
	- google
	- gemma
	datasets:
	- microsoft/orca-math-word-problems-200k
	base_model: google/gemma-2b
	license: apache-2.0
	---

	### Finetuning Overview:

	Model Used: google/gemma-2b

	Dataset: microsoft/orca-math-word-problems-200k

	#### Dataset Insights:

	[Orca-Math](https://huggingface.co/datasets/microsoft/orca-math-word-problems-200k) - This dataset contains ~200K grade school math word problems. All answers in the dataset were generated using Azure GPT4-Turbo. Please refer to [Orca-Math: Unlocking the potential of SLMs in Grade School Math](https://arxiv.org/pdf/2402.14830.pdf) for details about the dataset construction.

	#### Finetuning Details:

	Using [MonsterAPI](https://monsterapi.ai)'s [no-code LLM finetuner](https://monsterapi.ai/finetuning), this finetuning run:

	- Achieved a 68% improvement over the base model on the GSM Plus benchmark.
	- Completed 10 epochs in a total of 2d 7h 45m on an A6000 48GB GPU.
	- Cost $11.3 per epoch.
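The duration and per-epoch cost above can be cross-checked against the total cost listed under the hyperparameters. This is a minimal sketch using only figures quoted in this card; the per-epoch duration and the hourly rate are derived values, not numbers reported by MonsterAPI.

```python
from datetime import timedelta

# Figures quoted in this model card.
EPOCHS = 10
COST_PER_EPOCH_USD = 11.3
TOTAL_DURATION = timedelta(days=2, hours=7, minutes=45)

# Derived values (not reported in the card).
total_cost_usd = EPOCHS * COST_PER_EPOCH_USD
hours_total = TOTAL_DURATION.total_seconds() / 3600
hours_per_epoch = hours_total / EPOCHS
implied_usd_per_hour = total_cost_usd / hours_total

print(f"total cost:      ${total_cost_usd:.2f}")
print(f"total duration:  {hours_total:.2f} h")
print(f"per epoch:       {hours_per_epoch:.3f} h")
print(f"implied rate:    ${implied_usd_per_hour:.2f}/h")
```

The computed total agrees with the $113 total finetuning cost stated in the card.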

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/63ba46aa0a9866b28cb19a14/puTKYn6MPlVzjfcwTAFXQ.png)

	#### Hyperparameters & Additional Details:

	- Epochs: 10
	- Total Finetuning Cost: $113
	- Model Path: google/gemma-2b
	- Learning Rate: 0.0001
	- Gradient Accumulation Steps: 32
	- lora_alpha: 128
	- lora_r: 64
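For readers who want to reproduce a similar setup with the `peft` library, the LoRA values above map onto a `LoraConfig` roughly as sketched below. This is not MonsterAPI's actual training configuration: the card lists only `lora_r` and `lora_alpha`, so `task_type` is an assumption and target modules and dropout are left at PEFT's defaults. The `HYPERPARAMS` dict and helper names are hypothetical.

```python
# Hyperparameters as listed in this card.
HYPERPARAMS = {
    "base_model": "google/gemma-2b",
    "epochs": 10,
    "learning_rate": 1e-4,
    "gradient_accumulation_steps": 32,
    "lora_r": 64,
    "lora_alpha": 128,
}


def lora_scaling(alpha: int, r: int) -> float:
    """Standard LoRA scales the low-rank update by alpha / r."""
    return alpha / r


def build_lora_config():
    """Build a PEFT LoraConfig from the listed values (requires `pip install peft`)."""
    # Imported lazily so the rest of the sketch runs without peft installed.
    from peft import LoraConfig

    return LoraConfig(
        r=HYPERPARAMS["lora_r"],
        lora_alpha=HYPERPARAMS["lora_alpha"],
        task_type="CAUSAL_LM",  # assumption: not stated in the card
    )


print(lora_scaling(HYPERPARAMS["lora_alpha"], HYPERPARAMS["lora_r"]))  # 2.0
```

With `lora_alpha` set to twice `lora_r`, the adapter update is scaled by a factor of 2 under the standard LoRA formulation.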

	#### Benchmarking Performance Details:

	Gemma-2B finetuned with MonsterAPI scored 20.02 on the GSM Plus benchmark.

	- This is a 68% improvement over the base model's score.
	- It outperformed larger models such as LLaMA-2-13B and Code-LLaMA-7B.

	This result suggests that targeted fine-tuning can significantly improve model performance.
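A relative improvement is computed as `(finetuned - base) / base`. The sketch below inverts that relation to estimate the base score implied by the two figures above. The base score itself is not stated in this card and 68% is presumably rounded, so treat the result as approximate; the function names are illustrative.

```python
def relative_improvement(finetuned: float, base: float) -> float:
    """Relative gain of the finetuned score over the base score."""
    return (finetuned - base) / base


def implied_base_score(finetuned: float, improvement: float) -> float:
    """Invert relative_improvement to recover the base score."""
    return finetuned / (1 + improvement)


FINETUNED_GSM_PLUS = 20.02  # from this card
IMPROVEMENT = 0.68          # from this card (68%)

base_estimate = implied_base_score(FINETUNED_GSM_PLUS, IMPROVEMENT)
print(f"implied base score: about {base_estimate:.1f}")
```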

	### Read the detailed case study [here](https://blog.monsterapi.ai/finetuned-gemma-2b-on-monsterapi-outperforms-llama-13b/)

	![Benchmarking Performance](https://cdn-uploads.huggingface.co/production/uploads/63ba46aa0a9866b28cb19a14/ZpLtZm-32Y0W4LwW5LptZ.png)
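Since this repository is tagged `library_name: peft` with `google/gemma-2b` as its base model, the adapter can presumably be loaded on top of the base weights as sketched below. This is not an official usage snippet: it assumes the adapter weights are published under the repository id `monsterapi/gemma-2b-lora-maths-orca-200k`, that `transformers`, `peft`, and `torch` are installed, and that you have access to `google/gemma-2b` on Hugging Face. The card does not document a prompt template, so the question is passed in as plain text.

```python
BASE_MODEL_ID = "google/gemma-2b"
ADAPTER_ID = "monsterapi/gemma-2b-lora-maths-orca-200k"  # assumed adapter repo id


def load_model():
    """Load the base model and attach the LoRA adapter."""
    # Imported lazily so this file can be read or imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model.eval()
    return tokenizer, model


def answer(question: str, tokenizer, model, max_new_tokens: int = 256) -> str:
    """Generate a completion for a plain-text math question."""
    inputs = tokenizer(question, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (downloads the model weights on first run):
# tokenizer, model = load_model()
# print(answer("A shop sells 3 pens for $2. How much do 12 pens cost?", tokenizer, model))
```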
