metadata
library_name: peft
tags:
- math
- google
- gemma
datasets:
- microsoft/orca-math-word-problems-200k
base_model: google/gemma-2b
license: apache-2.0
Finetuning Overview:
Model Used: google/gemma-2b
Dataset: microsoft/orca-math-word-problems-200k
Dataset Insights:
Orca-Math - This dataset contains ~200K grade school math word problems. All the answers in this dataset are generated using Azure GPT4-Turbo. Please refer to Orca-Math: Unlocking the potential of SLMs in Grade School Math for details about the dataset construction.
Finetuning Details:
This finetuning was run with MonsterAPI's no-code LLM finetuner:
- Delivered a 68% boost in performance over the base model.
- Completed in a total duration of 2d 7h 45m for 10 epochs using an A6000 48GB GPU.
- Proved cost-effective, with a single epoch costing only $11.3.
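The per-epoch and total figures can be cross-checked with a few lines of Python. This is only a sanity check on the numbers quoted above; the hourly rate is derived here and is not a price stated in this card.

```python
# Figures quoted in this card.
epochs = 10
cost_per_epoch_usd = 11.3
total_hours = 2 * 24 + 7 + 45 / 60  # 2d 7h 45m expressed in hours

total_cost_usd = epochs * cost_per_epoch_usd
hours_per_epoch = total_hours / epochs

# Derived value, NOT stated in the card: implied cost per GPU hour.
implied_usd_per_hour = total_cost_usd / total_hours

print(f"total cost: ${total_cost_usd:.2f}")
print(f"hours per epoch: {hours_per_epoch:.3f}")
print(f"implied $/hour: {implied_usd_per_hour:.2f}")
```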
Hyperparameters & Additional Details:
- Epochs: 10
- Total Finetuning Cost: $113
- Model Path: google/gemma-2b
- Learning Rate: 0.0001
- Gradient Accumulation Steps: 32
- lora_alpha: 128
- lora_r: 64
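To make the LoRA settings easier to reason about, the sketch below collects the listed values in a small container and derives two quantities from them. `FinetuneSettings` is a hypothetical helper, not part of MonsterAPI or PEFT. It assumes the standard LoRA convention of scaling the low-rank update by `lora_alpha / lora_r`, and the per-device batch size is not given in this card, so it is left as a caller-supplied argument. In PEFT, the two LoRA values correspond to the `lora_alpha` and `r` arguments of `LoraConfig`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinetuneSettings:
    """Hypothetical container for the hyperparameters listed in this card."""

    epochs: int = 10
    learning_rate: float = 1e-4
    gradient_accumulation_steps: int = 32
    lora_alpha: int = 128
    lora_r: int = 64

    @property
    def lora_scaling(self) -> float:
        # Assumption: standard LoRA scaling, alpha / r.
        return self.lora_alpha / self.lora_r

    def effective_batch_size(self, per_device_batch_size: int) -> int:
        # per_device_batch_size is NOT stated in the card; the caller supplies it.
        # Assumes a single GPU, as reported above.
        return per_device_batch_size * self.gradient_accumulation_steps


settings = FinetuneSettings()
print(settings.lora_scaling)
print(settings.effective_batch_size(per_device_batch_size=1))
```

With these values the adapter update is scaled by a factor of 2, and each optimizer step accumulates gradients over 32 micro-batches.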
Benchmarking Performance Details:
Gemma-2B finetuned using MonsterAPI achieved a score of 20.02 on the GSM Plus benchmark.
- This represents a 68% improvement over its base model performance.
- Notably, it outperformed larger models like LLaMA-2-13B and Code-LLaMA-7B.
This result suggests that targeted fine-tuning can significantly improve model performance.
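The base model's score is not quoted in this card, but it can be back-calculated from the two numbers that are. The sketch below assumes that "68% improvement" means a relative gain over the base score, so the result is an estimate and not a reported benchmark figure.

```python
finetuned_score = 20.02      # GSM Plus score reported in this card
relative_improvement = 0.68  # assumption: 68% is a relative gain over the base score

# finetuned = base * (1 + improvement)  =>  base = finetuned / (1 + improvement)
implied_base_score = finetuned_score / (1 + relative_improvement)

print(f"implied base score: {implied_base_score:.2f}")
```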
Read the Detailed Case Study over here