metadata
library_name: transformers
tags:
- trismegistus
- llama3
- esoteric
license: llama3.2
datasets:
- teknium/trismegistus-project
base_model:
- meta-llama/Llama-3.2-1B
pipeline_tag: text-generation
Model Details
Model Description
A Trismegistus fine-tune of Llama 3.2 1B. Credit to teknium for the dataset and the original model.
Model Sources
- Base model: meta-llama/Llama-3.2-1B
Uses
- Use for esoteric joy.
Bias, Risks, and Limitations
May be biased as hell.
Recommendation:
- Don't take it personally.
How to Get Started with the Model
- Run it.
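A minimal sketch of running the model with the transformers library named in the metadata. The repository id below is a hypothetical placeholder, since this card does not state where the fine-tuned weights are published, and the sampling settings are illustrative assumptions rather than values recommended by the author.

```python
# Hypothetical placeholder: replace with the actual repository id or a local path.
MODEL_ID = "your-username/trismegistus-llama-3.2-1b"


def generation_kwargs(max_new_tokens: int = 256, temperature: float = 0.7) -> dict:
    """Illustrative sampling settings (assumed, not taken from the model card)."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": 0.9,
    }


def generate(prompt: str, model_id: str = MODEL_ID) -> str:
    """Load the model and return the decoded completion for one prompt."""
    # Imported lazily so the heavy dependencies are only needed when generating.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id).to(device)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, **generation_kwargs())
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` with a prompt of your choice downloads the weights on first use. No prompt template is applied here because the card does not specify one.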
Training Data
- teknium/trismegistus-project
Training Hyperparameters
- LoRA fine-tuning with 4-bit quantization, via PEFT
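The one-line description above corresponds roughly to the following setup. This is a sketch under stated assumptions: the rank, alpha, dropout, target modules, NF4 quantization type, and compute dtype are hypothetical values chosen for illustration, since the card does not record the ones actually used.

```python
BASE_MODEL = "meta-llama/Llama-3.2-1B"  # base model listed in the card metadata

# Hypothetical LoRA settings; the card does not state the real values.
LORA_SETTINGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "task_type": "CAUSAL_LM",
}


def build_peft_model(base_model: str = BASE_MODEL):
    """Load the base model in 4-bit and wrap it with LoRA adapters."""
    # Lazy imports: requires torch, transformers, peft, and bitsandbytes.
    import torch
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",              # assumed quantization type
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype
    )
    model = AutoModelForCausalLM.from_pretrained(
        base_model, quantization_config=bnb_config
    )
    # Prepares the quantized model for training (e.g. casts norm layers).
    model = prepare_model_for_kbit_training(model)
    return get_peft_model(model, LoraConfig(**LORA_SETTINGS))
```

Only the adapter weights are trainable in the returned model; the quantized base weights stay frozen.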
Speeds, Sizes, Times
- global_step: 16905
- train_loss: 1.169401215731269
- train_runtime: 21882.4747 seconds (about 6.1 hours)
- train_samples_per_second: 3.09
- train_steps_per_second: 0.773
- total_flos: 4.437195883099177e+17
- epoch: 5.0
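A few derived quantities follow from the statistics above by simple arithmetic. The sketch assumes that `train_runtime` is reported in seconds and that samples per second divided by steps per second gives the effective number of samples per optimizer step; neither is stated explicitly in this card.

```python
# Values copied from the trainer statistics reported in this card.
GLOBAL_STEP = 16905
EPOCHS = 5.0
TRAIN_RUNTIME_S = 21882.4747   # assumed to be seconds
SAMPLES_PER_SECOND = 3.09
STEPS_PER_SECOND = 0.773

runtime_hours = TRAIN_RUNTIME_S / 3600
steps_per_epoch = GLOBAL_STEP / EPOCHS
# Effective samples per optimizer step (batch size times any accumulation).
samples_per_step = SAMPLES_PER_SECOND / STEPS_PER_SECOND
samples_per_epoch = steps_per_epoch * round(samples_per_step)

print(f"runtime: {runtime_hours:.2f} h")                # runtime: 6.08 h
print(f"steps per epoch: {steps_per_epoch:.0f}")        # steps per epoch: 3381
print(f"samples per step: {samples_per_step:.2f}")      # samples per step: 4.00
print(f"samples per epoch: {samples_per_epoch:.0f}")    # samples per epoch: 13524
```

The per-epoch sample count is an estimate derived from rounded throughput figures, so it may differ slightly from the true dataset size.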
Evaluation and Metrics
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| arc_challenge | 1 | none | 0 | acc | ↑ | 0.3345 | ± | 0.0138 |
| | | none | 0 | acc_norm | ↑ | 0.3695 | ± | 0.0141 |
| arc_easy | 1 | none | 0 | acc | ↑ | 0.6044 | ± | 0.0100 |
| | | none | 0 | acc_norm | ↑ | 0.5694 | ± | 0.0102 |
| boolq | 2 | none | 0 | acc | ↑ | 0.6410 | ± | 0.0084 |
| hellaswag | 1 | none | 0 | acc | ↑ | 0.4400 | ± | 0.0050 |
| | | none | 0 | acc_norm | ↑ | 0.5728 | ± | 0.0049 |
| openbookqa | 1 | none | 0 | acc | ↑ | 0.2260 | ± | 0.0187 |
| | | none | 0 | acc_norm | ↑ | 0.3540 | ± | 0.0214 |
| piqa | 1 | none | 0 | acc | ↑ | 0.7002 | ± | 0.0107 |
| | | none | 0 | acc_norm | ↑ | 0.7024 | ± | 0.0107 |
| winogrande | 1 | none | 0 | acc | ↑ | 0.5785 | ± | 0.0139 |
Environmental Impact
Will steal your horse and kill your cat.