BEE-spoke-data/smol_llama-101M-GQA-GGUF

Quantized GGUF model files for smol_llama-101M-GQA from BEE-spoke-data
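
As a rough sketch of how one of these files might be loaded, the snippet below uses the llama-cpp-python package (with huggingface-hub installed for the download). The repository id is the one this page belongs to; the filename glob `*q4_k_m.gguf` is an assumption about how the 4-bit file is named, so check the repository's file list first. The helper names `load_quantized` and `complete` are hypothetical.

```python
def load_quantized(
    repo_id="afrideva/smol_llama-101M-GQA-GGUF",
    filename="*q4_k_m.gguf",  # ASSUMPTION: glob for the 4-bit file; verify in the repo
    n_ctx=1024,               # matches the context length stated in the model card
):
    """Download one GGUF file from the Hub and open it with llama.cpp."""
    # Imported lazily so the module can be inspected without llama-cpp-python installed.
    from llama_cpp import Llama

    return Llama.from_pretrained(
        repo_id=repo_id,
        filename=filename,
        n_ctx=n_ctx,
        verbose=False,
    )


def complete(llm, prompt, max_tokens=48):
    """Return only the generated continuation for a plain-text prompt."""
    result = llm(prompt, max_tokens=max_tokens)
    return result["choices"][0]["text"]
```

A call such as `complete(load_quantized(), "def fibonacci(n):")` would then return the continuation as a string. Since this is a raw pre-trained checkpoint, expect plain text continuation rather than instruction following.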

Original Model Card:

smol_llama-101M-GQA

A small decoder-only model with 101M total parameters. This is the first version of the model.

  • 768 hidden size, 6 layers
  • GQA (24 query heads, 8 key-value heads), context length 1024
  • trained from scratch
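
The figures above imply some simple attention arithmetic, sketched below. It assumes the usual Llama convention that the per-head dimension is the hidden size divided by the number of query heads, and it counts KV-cache elements for a single sequence at full context length; neither assumption is stated in the model card itself.

```python
# Values taken from the model card.
hidden_size = 768
num_layers = 6
num_query_heads = 24
num_kv_heads = 8
context_length = 1024

# ASSUMPTION: head_dim = hidden_size / num_query_heads (standard Llama layout).
head_dim = hidden_size // num_query_heads               # 32
queries_per_kv_head = num_query_heads // num_kv_heads   # 3 query heads share each KV head
kv_proj_width = num_kv_heads * head_dim                 # 256 output features for K (and for V)


def kv_cache_elements(kv_heads: int) -> int:
    """Elements stored for keys plus values, one sequence, full context."""
    return 2 * num_layers * kv_heads * head_dim * context_length


gqa_cache = kv_cache_elements(num_kv_heads)      # 3,145,728 elements
mha_cache = kv_cache_elements(num_query_heads)   # 9,437,184 elements if every head had its own K/V

print(f"head_dim={head_dim}, queries per KV head={queries_per_kv_head}")
print(f"KV cache: {gqa_cache:,} elements with GQA vs {mha_cache:,} with full multi-head attention")
```

Under these assumptions, grouping 24 query heads over 8 key-value heads cuts the KV cache to one third of what full multi-head attention would need.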

Notes

This checkpoint is the 'raw' pre-trained model and has not been tuned to a more specific task. It should be fine-tuned before use in most cases.

Checkpoints & Links

  • smol-er 81M parameter checkpoint with in/out embeddings tied: here
  • Fine-tuned on pypi to generate Python code - link
  • For the chat version of this model, please see here

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric                 Value
Avg.                   25.32
ARC (25-shot)          23.55
HellaSwag (10-shot)    28.77
MMLU (5-shot)          24.24
TruthfulQA (0-shot)    45.76
Winogrande (5-shot)    50.67
GSM8K (5-shot)          0.83
DROP (3-shot)           3.39
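
The Avg. row is consistent with an unweighted mean over all seven benchmarks, GSM8K and DROP included. A quick check, with the scores copied from the table above:

```python
# Scores copied from the leaderboard table.
scores = {
    "ARC (25-shot)": 23.55,
    "HellaSwag (10-shot)": 28.77,
    "MMLU (5-shot)": 24.24,
    "TruthfulQA (0-shot)": 45.76,
    "Winogrande (5-shot)": 50.67,
    "GSM8K (5-shot)": 0.83,
    "DROP (3-shot)": 3.39,
}

# Unweighted arithmetic mean across the seven benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. = {average:.2f}")  # prints "Avg. = 25.32"
```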

GGUF

  • Model size: 101M params
  • Architecture: llama
  • Quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
