---
license: cc-by-sa-4.0
datasets:
  - csitfun/LogiCoT
language:
  - en
library_name: transformers
pipeline_tag: text-generation
tags:
  - logical
---

This model is LLaMa-7b instruction-tuned on the LogiCoT data and the GPT-4 alpaca data.
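
Below is a minimal usage sketch with `transformers`. The repository id `csitfun/llama-7b-logicot`, the prompt format, and the generation settings are assumptions, not specified by this card:

```python
# Sketch of basic text generation with this model (assumed repo id and settings).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "csitfun/llama-7b-logicot"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = (
    "Premise: All birds can fly. Penguins are birds.\n"
    "Question: Can penguins fly?\nAnswer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```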

We used 2 A100 GPUs.

We first instruction-tuned LLaMa-7b on the GPT-4 alpaca data for 3 days, then on the LogiCoT data for 4 days. A sketch of this two-stage setup follows.
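
The sketch below illustrates the two-stage instruction tuning described above; it is not the training script we used. The base checkpoint, dataset ids, field names, and all hyperparameters are assumptions:

```python
# Hypothetical sketch of two-stage instruction tuning. Dataset ids, the
# prompt format, and hyperparameters are assumptions; the card above does
# not specify them.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "huggyllama/llama-7b"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_id)


def tokenize(example):
    # Assumed instruction/output fields; adjust to the actual dataset schema.
    text = example["instruction"] + "\n" + example["output"] + tokenizer.eos_token
    return tokenizer(text, truncation=True, max_length=1024)


def tune(dataset_id, output_dir):
    # Causal-LM fine-tuning on one instruction dataset.
    ds = load_dataset(dataset_id, split="train").map(tokenize)
    args = TrainingArguments(
        output_dir=output_dir,
        per_device_train_batch_size=4,  # assumed
        gradient_accumulation_steps=8,  # assumed
        num_train_epochs=3,             # assumed
        learning_rate=2e-5,             # assumed
        bf16=True,
    )
    Trainer(
        model=model,
        args=args,
        train_dataset=ds,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    ).train()


# Stage 1: GPT-4 alpaca data; stage 2: LogiCoT.
tune("vicgalle/alpaca-gpt4", "stage1-alpaca-gpt4")  # assumed dataset id
tune("csitfun/LogiCoT", "stage2-logicot")
```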

## Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 39.37 |
| ARC (25-shot) | 47.01 |
| HellaSwag (10-shot) | 72.56 |
| MMLU (5-shot) | 38.93 |
| TruthfulQA (0-shot) | 43.63 |
| Winogrande (5-shot) | 67.56 |
| GSM8K (5-shot) | 0.0 |
| DROP (3-shot) | 5.92 |
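
One way to reproduce a single benchmark locally is the EleutherAI lm-evaluation-harness Python API; this is a sketch under assumptions: the repo id is assumed, the task/few-shot pairing is read off the table above, and the leaderboard's exact harness version and prompts may differ, so scores can vary.

```python
# Sketch of re-running the ARC (25-shot) row with lm-evaluation-harness
# (pip install lm-eval). Repo id and settings are assumptions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=csitfun/llama-7b-logicot",  # assumed repo id
    tasks=["arc_challenge"],  # ARC row in the table above
    num_fewshot=25,           # matches the 25-shot setting
)
print(results["results"]["arc_challenge"])
```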