Aether-12b

Aether-12b is a fine-tuned large language model based on Arcanum-12b, further trained on the CleverBoi-Data-20k dataset.

Model Details 📊

Model Architecture 🏗️

  • Base model: Xclbr7/Arcanum-12b
  • Parameter count: ~12.2 billion (BF16 weights; see the loading sketch below)
  • Architecture specifics: Transformer-based language model, unchanged from Arcanum-12b
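
A minimal loading sketch, assuming the standard transformers causal-LM interface; the repo ID and BF16 dtype come from this card, everything else is illustrative:

```python
# Load Aether-12b in bfloat16 and sanity-check the parameter count.
# Assumes enough GPU/CPU memory for a ~12.2B-parameter model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aixonlab/Aether-12b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are published in BF16
    device_map="auto",
)
print(f"{sum(p.numel() for p in model.parameters()) / 1e9:.1f}B parameters")
```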

Open LLM Leaderboard Evaluation Results

Coming soon.

Training & Fine-tuning 🔄

Aether-12b was fine-tuned on the following dataset:

  • Dataset: theprint/CleverBoi-Data-20k
  • Fine-tuning method: supervised fine-tuning (SFT) with TRL's SFTTrainer, using the AdamW optimizer, a cosine-decay learning-rate schedule, and bfloat16 precision (see the sketch below).
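
As a rough illustration of that recipe, here is a minimal SFT sketch with TRL. The dataset ID, optimizer, scheduler, and precision are taken from this card; all numeric hyperparameters are assumptions, not the values actually used to train Aether-12b:

```python
# Illustrative SFT setup only; hyperparameter values are assumed.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("theprint/CleverBoi-Data-20k", split="train")

config = SFTConfig(
    output_dir="aether-12b-sft",
    bf16=True,                      # bfloat16 precision (per this card)
    optim="adamw_torch",            # AdamW optimizer (per this card)
    lr_scheduler_type="cosine",     # cosine decay schedule (per this card)
    learning_rate=2e-5,             # assumed
    num_train_epochs=1,             # assumed
    per_device_train_batch_size=1,  # assumed
    gradient_accumulation_steps=8,  # assumed
)

trainer = SFTTrainer(
    model="Xclbr7/Arcanum-12b",  # the base model
    train_dataset=dataset,
    args=config,
)
trainer.train()
```

Depending on the dataset's column layout, you may also need to pass a formatting function or set `dataset_text_field` in `SFTConfig`.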

The CleverBoi-Data-20k dataset improved the model in the following ways:

  1. Enhanced reasoning and problem-solving capabilities
  2. Broader knowledge across a wide range of topics
  3. Improved performance on writing and analysis tasks
  4. Better contextual understanding and response generation

Intended Use 🎯

Aether-12b is intended for use as a general-purpose assistant or as a role-specific chat bot.
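
A minimal chat sketch, assuming the tokenizer ships a chat template and a recent transformers version; the prompts are illustrative:

```python
# Illustrative chat usage via the transformers text-generation pipeline.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="aixonlab/Aether-12b",
    torch_dtype=torch.bfloat16,  # matches the published BF16 weights
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a concise writing assistant."},
    {"role": "user", "content": "Outline a short essay on model fine-tuning."},
]
out = pipe(messages, max_new_tokens=256)
print(out[0]["generated_text"][-1]["content"])  # assistant reply
```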

Ethical Considerations 🤔

As a fine-tuned model based on Arcanum-12b, this model may inherit biases and limitations from its parent model and the fine-tuning dataset. Users should be aware of potential biases in generated content and use the model responsibly.

Acknowledgments 🙏

We acknowledge the contributions of:

  • theprint for the amazing CleverBoi-Data-20k dataset