---
license: other
language:
- en
---
## Model Details
This is an unofficial implementation of "AlpaGasus: Training a Better Alpaca with Fewer Data" with LLaMA2 & QLoRA. Training code is available at our repo.
- Developed by: Yunsang Yoo and Hyunwoo Ko
- Model type: Auto-regressive model
- Language(s): English
- Base Model: meta-llama/Llama-2-13b-hf
- License: Non-Commercial Creative Commons license (CC BY-NC-4.0)
## Training Dataset
"StudentLLM/Alpagasus-2-13b-QLoRA-merged" was trained on gpt4life's gpt-3.5-turbo-filtered dataset, `alpaca_t45.json`. Each record in the dataset has the following fields:
```
{
  'instruction': the instruction describing the task,
  'input': optional additional context accompanying the instruction (may be empty),
  'output': the answer to the instruction
}
```
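For illustration, a record following this schema might look like the sketch below. The contents are hypothetical, not an actual entry from `alpaca_t45.json`:

```python
# Hypothetical example record in the alpaca_t45.json schema
# (illustrative values only, not taken from the dataset).
record = {
    "instruction": "Name the three primary colors.",
    "input": "",  # empty when no extra context accompanies the instruction
    "output": "The three primary colors are red, yellow, and blue.",
}
```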
## Prompt Template
Alpaca-style prompt:
```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
<prompt> (without the <>)

### Input:
<prompt> (if input exists)

### Response:
```
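The template above can be rendered programmatically. The helper below is a minimal sketch (the function name and exact whitespace are our choices, not part of the original card), following the common Alpaca convention of omitting the `### Input:` section when no input is given:

```python
def build_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Format a request with the Alpaca-style template used by this model."""
    header = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
    )
    if input_text:
        return (
            f"{header}### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n### Response:\n"
        )
    # No input: skip the "### Input:" section entirely
    return f"{header}### Instruction:\n{instruction}\n\n### Response:\n"

prompt = build_alpaca_prompt("Summarize the following article.", "The article text...")
```

The resulting string can then be tokenized and passed to the model for generation.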
## Fine-tuning Procedure
Our model was fine-tuned using QLoRA on a single A100 80GB GPU. Training details are described in our repo.
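A typical QLoRA setup with the Hugging Face `peft`/`bitsandbytes` stack looks roughly like the config fragment below. The hyperparameter values shown are placeholders for illustration; the values actually used are in the authors' repo:

```python
# Illustrative QLoRA configuration sketch (placeholder hyperparameters).
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization of the frozen base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# Low-rank adapters trained on top of the quantized weights
lora_config = LoraConfig(
    r=16,                                 # placeholder rank
    lora_alpha=32,                        # placeholder scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # placeholder target modules
    task_type="CAUSAL_LM",
)
```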
## Benchmark Metrics
The performance of "StudentLLM/Alpagasus-2-13b-QLoRA-merged" is reported on Hugging Face's Open LLM Leaderboard. The model was evaluated on the leaderboard's tasks: ARC, HellaSwag, MMLU, and TruthfulQA.
| Metric | Value |
|---|---|
| Avg. | 59.34 |
| MMLU | 55.27 |
| ARC | 61.09 |
| HellaSwag | 82.46 |
| TruthfulQA | 38.53 |
## LLM Evaluation
## Citations
```
@article{chen2023alpagasus,
  title={AlpaGasus: Training a Better Alpaca with Fewer Data},
  author={Lichang Chen and Shiyang Li and Jun Yan and Hai Wang and Kalpa Gunaratna and Vikas Yadav and Zheng Tang and Vijay Srinivasan and Tianyi Zhou and Heng Huang and Hongxia Jin},
  journal={arXiv preprint arXiv:2307.08701},
  year={2023}
}
```