Model Details

Model Description

A Trismegistus fine-tune of Llama 3.2 1B. Credits to teknium for the dataset and the original model.

Model Sources

Base model: Llama 3.2 1B

Uses

  • Use for esoteric joy.

Bias, Risks, and Limitations

  • May be biased as hell.

  • Recommendation:

    • Don't take it personally.

How to Get Started with the Model

  • Run it. A minimal loading sketch follows below.
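
The card itself only says to run it, so here is a minimal sketch assuming the standard transformers generation path; the prompt, sampling settings, and dtype are illustrative assumptions rather than documented values.

```python
# Minimal sketch: plain text generation with transformers (prompt and sampling
# settings are illustrative assumptions, not values documented by this card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jtatman/llama-3.2-1b-trismegistus"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32,   # weights ship in F32; switch to bfloat16 to save memory
    device_map="auto",           # requires accelerate; drop for CPU-only use
)

prompt = "Explain the hermetic principle of correspondence in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.8)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```

If the repository ships a chat template, `tokenizer.apply_chat_template` can be used in place of the raw prompt string.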

Training Data

  • teknium's Trismegistus dataset (per the credits above).

Training Hyperparameters

  • LoRA adapters trained in 4-bit (QLoRA-style) via PEFT; a configuration sketch follows below.
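
The only detail recorded here is "LoRA, 4-bit, PEFT", so the sketch below shows what a typical QLoRA-style setup looks like; the base checkpoint name, rank, target modules, and dropout are all assumptions, not the values actually used.

```python
# Sketch of a QLoRA-style setup; every hyperparameter below is an assumption.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",              # assumed base checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                   # assumed rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()          # sanity check: only adapter weights train
```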

Speeds, Sizes, Times

  • global_step: 16905
  • train_loss: 1.1694
  • train_runtime: 21,882.47 s (≈ 6.1 hours)
  • train_samples_per_second: 3.09
  • train_steps_per_second: 0.773
  • total_flos: 4.44e17
  • epochs: 5.0

Evaluation and Metrics

| Tasks         | Version | Filter | n-shot | Metric   | Value  | Stderr   |
|---------------|---------|--------|--------|----------|--------|----------|
| arc_challenge | 1       | none   | 0      | acc ↑    | 0.3345 | ± 0.0138 |
|               |         | none   | 0      | acc_norm ↑ | 0.3695 | ± 0.0141 |
| arc_easy      | 1       | none   | 0      | acc ↑    | 0.6044 | ± 0.0100 |
|               |         | none   | 0      | acc_norm ↑ | 0.5694 | ± 0.0102 |
| boolq         | 2       | none   | 0      | acc ↑    | 0.6410 | ± 0.0084 |
| hellaswag     | 1       | none   | 0      | acc ↑    | 0.4400 | ± 0.0050 |
|               |         | none   | 0      | acc_norm ↑ | 0.5728 | ± 0.0049 |
| openbookqa    | 1       | none   | 0      | acc ↑    | 0.2260 | ± 0.0187 |
|               |         | none   | 0      | acc_norm ↑ | 0.3540 | ± 0.0214 |
| piqa          | 1       | none   | 0      | acc ↑    | 0.7002 | ± 0.0107 |
|               |         | none   | 0      | acc_norm ↑ | 0.7024 | ± 0.0107 |
| winogrande    | 1       | none   | 0      | acc ↑    | 0.5785 | ± 0.0139 |
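
The layout above matches EleutherAI lm-evaluation-harness output. Assuming that harness produced these numbers, a zero-shot run over the same tasks would look roughly like this sketch (harness version and batch size are assumptions):

```python
# Sketch of reproducing the table with lm-evaluation-harness (assumed to be the tool used).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=jtatman/llama-3.2-1b-trismegistus,dtype=float32",
    tasks=["arc_challenge", "arc_easy", "boolq", "hellaswag",
           "openbookqa", "piqa", "winogrande"],
    num_fewshot=0,
    batch_size=8,                     # assumed; adjust to available memory
)

for task, metrics in results["results"].items():
    print(task, metrics)
```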

Environmental Impact

Will steal your horse and kill your cat.
