---
language:
- en
tags:
- NLP
license: mit
datasets:
- TristanBehrens/metal_cluster
- TristanBehrens/metal_arrangement
- TristanBehrens/metal_interleaved
base_model: None
---

# metal_omni_two - An xLSTM Model

![Trained with Helibrunna](banner.jpg)

Trained with [Helibrunna](https://github.com/AI-Guru/helibrunna) by [Dr. Tristan Behrens](https://de.linkedin.com/in/dr-tristan-behrens-734967a2).

## Configuration

```
training:
  model_name: metal_omni_two
  batch_size: 28
  lr: 0.001
  lr_warmup_steps: 1445
  lr_decay_until_steps: 14455
  lr_decay_factor: 0.001
  weight_decay: 0.1
  amp_precision: bfloat16
  weight_precision: float32
  enable_mixed_precision: true
  num_epochs: 5
  output_dir: output/metal_omni_two
  save_every_step: 500
  log_every_step: 10
  wandb_project: tonnetz
  torch_compile: false

model:
  type: llamathree
  context_length: 2048
  emb_dim: 256
  n_heads: 4
  n_layers: 6
  hidden_dim: 128
  hidden_activation: silu
  n_kv_groups: 1
  rope_base: 50000
  rope_freq: null
  dtype: float32
  vocab_size: 269

dataset:
  hugging_face_ids:
  - TristanBehrens/metal_cluster
  - TristanBehrens/metal_arrangement
  - TristanBehrens/metal_interleaved

tokenizer:
  type: whitespace
  fill_token: '[EOS]'
```
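The `lr_warmup_steps`, `lr_decay_until_steps`, and `lr_decay_factor` values describe a warmup-then-decay learning-rate schedule. The exact schedule Helibrunna uses is not shown in this card; the following is a minimal sketch assuming linear warmup followed by cosine decay down to `lr * lr_decay_factor`:

```python
import math

def learning_rate(step, lr=0.001, warmup_steps=1445,
                  decay_until_steps=14455, decay_factor=0.001):
    """Hypothetical schedule matching the config values above:
    linear warmup, cosine decay, then a constant floor.
    Illustrative only; the actual Helibrunna schedule may differ."""
    min_lr = lr * decay_factor
    if step < warmup_steps:
        # Linear warmup from 0 to the peak learning rate.
        return lr * step / warmup_steps
    if step >= decay_until_steps:
        # After decay is finished, hold at the floor.
        return min_lr
    # Cosine decay from lr down to min_lr.
    progress = (step - warmup_steps) / (decay_until_steps - warmup_steps)
    return min_lr + 0.5 * (lr - min_lr) * (1 + math.cos(math.pi * progress))
```

Under these assumptions, the rate peaks at `lr = 0.001` at step 1445 and settles at `1e-6` from step 14455 onward.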
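The tokenizer section specifies a whitespace tokenizer with `[EOS]` as the fill token. As a rough sketch of what that implies (the names `encode` and `vocab` below are hypothetical, not Helibrunna's actual API): input is split on whitespace, mapped to ids, and padded out to the context length with the fill token.

```python
def encode(text, vocab, context_length=2048, fill_token="[EOS]"):
    """Hypothetical whitespace tokenizer: split the input on
    whitespace, map each token to its id, and fill the remainder
    of the context window with [EOS]. Illustrative only."""
    tokens = text.split()
    tokens += [fill_token] * (context_length - len(tokens))
    return [vocab[t] for t in tokens]
```

With the configured `vocab_size: 269`, the real vocabulary would hold 269 such token-to-id entries.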