---
license: mit
datasets:
- allenai/c4
language:
- en
library_name: transformers
---

# Bingus-v0.1-60M-Base

A not-so-state-of-the-art 60M-parameter transformer model. It uses the default OLMo architecture.

### Specs

- Heads: 8
- Layers: 8
- Model dimension: 512
- MLP dimension: 4096
- eval/v3-small-c4_en-validation/Perplexity: 40.33

### Training Data

Pretraining:
- 5B tokens of C4 (preprocessed, from olmo-data.org)
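As a rough sanity check on the "60M" in the name, here is a back-of-envelope parameter count from the specs above. The vocabulary size, tied embeddings, and a standard non-gated MLP are assumptions, not stated on this card, so treat the numbers as a sketch rather than the model's exact count:

```python
# Back-of-envelope parameter count from the specs above.
# Assumptions (NOT from the card): vocab_size = 50304 (OLMo's padded
# GPT-NeoX-style vocab), tied input/output embeddings, non-gated MLP,
# and no bias terms.
d_model = 512
n_layers = 8
d_mlp = 4096
vocab_size = 50304  # assumed

attn_per_layer = 4 * d_model * d_model  # Q, K, V, and output projections
mlp_per_layer = 2 * d_model * d_mlp     # up- and down-projections
block_params = n_layers * (attn_per_layer + mlp_per_layer)
embed_params = vocab_size * d_model     # counted once if embeddings are tied

total = block_params + embed_params
print(f"non-embedding: {block_params / 1e6:.1f}M, total: {total / 1e6:.1f}M")
```

Depending on whether embeddings are counted and on the exact vocabulary and MLP details, this lands in the roughly 40-70M range, so the 60M name is plausible under some counting convention, but the exact figure depends on details the card does not state.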