NEBULA-23.8B-v1.0

Technical notes

  • 108 layers, DUS procedure: Mistral(32) -> SOLAR(48) -> GALAXY(72) -> NEBULA(108)
  • 23.8B parameters
  • model created as an extension of the depth up-scaling (DUS) procedure used for SOLAR by Upstage
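
The layer counts above are consistent with repeating a SOLAR-style depth up-scaling step three times. The following is a minimal sketch, assuming each step duplicates the layer stack, drops a quarter of the layers at the seam from each copy, and concatenates what remains. The exact slice boundaries used for GALAXY and NEBULA are not stated in this card, so treat them as an assumption. The parameter estimate further assumes that the Mistral-7B layer shapes (hidden size 4096, MLP size 14336, 32 query heads, 8 KV heads, vocabulary 32000) are kept unchanged.

```python
def dus_step(n_layers: int) -> tuple[range, range]:
    """One depth up-scaling step: two copies of the stack, trimmed at the seam.

    Assumption: a quarter of the layers is dropped from the end of the first
    copy and from the start of the second copy (32 -> 24 + 24 = 48 for SOLAR).
    """
    drop = n_layers // 4
    first = range(0, n_layers - drop)   # copy A keeps the bottom layers
    second = range(drop, n_layers)      # copy B keeps the top layers
    return first, second


def dus_chain(start: int, steps: int) -> list[int]:
    """Layer count after each of `steps` successive up-scaling steps."""
    counts = [start]
    for _ in range(steps):
        first, second = dus_step(counts[-1])
        counts.append(len(first) + len(second))
    return counts


def estimate_params(n_layers: int, hidden: int = 4096, mlp: int = 14336,
                    n_heads: int = 32, n_kv_heads: int = 8,
                    vocab: int = 32000) -> int:
    """Rough parameter count for a Mistral-shaped decoder (assumed shapes)."""
    kv_dim = (hidden // n_heads) * n_kv_heads
    attn = 2 * hidden * hidden + 2 * hidden * kv_dim  # q, o and k, v projections
    ffn = 3 * hidden * mlp                            # gate, up, down projections
    norms = 2 * hidden                                # two RMSNorm weights per layer
    embeddings = 2 * vocab * hidden                   # input embeddings + untied output head
    return n_layers * (attn + ffn + norms) + embeddings + hidden  # + final norm


print(dus_chain(32, 3))                          # [32, 48, 72, 108]
print(f"{estimate_params(108) / 1e9:.1f}B")      # 23.8B
```

Under these assumptions the chain reproduces the 32 -> 48 -> 72 -> 108 progression, and 108 Mistral-shaped layers come out at roughly 23.8B parameters.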

Results

  • model can and will produce NSFW content
  • GSM8k evaluation often seems to be broken (0.00 on the leaderboard below); HellaSwag, Winogrande and TruthfulQA scores suggest it is a smart model
  • RP and ERP work surprisingly well, and I haven't encountered any GPTisms yet
  • lower memory footprint than 20B and 23B models
  • follows the character card very well
  • NSFW output feels fresh compared to existing models

Finetuning for RP

  • SFT using MinervaAI/Aesir-Preview dataset, 10 epochs
  • DPO using athirdpath/DPO_Pairs-Roleplay-Alpaca-NSFW dataset, 1 epoch
  • SFT using 1xAda6000, 10h
  • DPO using 1x3090, 30h
  • Jupyter notebooks and mergekit configs are available for anyone wanting to reproduce or reuse the scripts; just drop me a message
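
For readers unfamiliar with the DPO stage listed above, here is a minimal sketch of the standard per-pair DPO loss. It is not taken from the training notebooks: the `beta` value of 0.1 and the log-probabilities in the usage lines are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen and rejected replies
    under the policy being trained and under the frozen reference model.
    beta=0.1 is an assumed value, not one stated in this card.
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    z = -beta * margin
    # -log(sigmoid(beta * margin)) == softplus(z), computed in a stable form
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


# Hypothetical log-probabilities, for illustration only.
no_preference = dpo_loss(-10.0, -10.0, -10.0, -10.0)   # equals log(2)
prefers_chosen = dpo_loss(-5.0, -15.0, -10.0, -10.0)   # smaller than log(2)
print(no_preference, prefers_chosen)
```

The loss falls below log(2) once the policy favours the chosen reply more strongly than the reference model does.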

Prompt template

  • Alpaca
  • chat template is embedded in the tokenizer config and should load automatically
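
For frontends that do not read the tokenizer config, the prompt can be built by hand. This is a minimal sketch assuming the widely used Alpaca wording; the template embedded in the tokenizer config may differ in wording or whitespace and takes precedence. `build_alpaca_prompt` is a hypothetical helper name, not part of any library.

```python
from typing import Optional

# Assumption: the standard Alpaca preambles, not copied from this model's tokenizer config.
PREAMBLE = ("Below is an instruction that describes a task. "
            "Write a response that appropriately completes the request.")
PREAMBLE_WITH_INPUT = ("Below is an instruction that describes a task, paired with an input "
                       "that provides further context. "
                       "Write a response that appropriately completes the request.")


def build_alpaca_prompt(instruction: str, context: Optional[str] = None) -> str:
    """Format one Alpaca-style prompt, ending where the model should start writing."""
    if context:
        return (f"{PREAMBLE_WITH_INPUT}\n\n"
                f"### Instruction:\n{instruction}\n\n"
                f"### Input:\n{context}\n\n"
                f"### Response:\n")
    return (f"{PREAMBLE}\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Response:\n")


print(build_alpaca_prompt("Introduce your character in two sentences."))
```

When loading the model with transformers, calling `tokenizer.apply_chat_template` uses the embedded template instead and avoids any mismatch with this hand-written version.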

Context size

  • 4096
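
With a 4096-token window, long roleplay chats need trimming before each request. This is a minimal sketch of one way to do it, assuming the character card is always kept and the oldest turns are dropped first. `fit_to_context`, `reserve_for_reply` and the whitespace-based token counter are hypothetical stand-ins; in practice the model's own tokenizer should do the counting.

```python
from typing import Callable


def fit_to_context(card: str, turns: list[str],
                   count_tokens: Callable[[str], int],
                   max_context: int = 4096,
                   reserve_for_reply: int = 512) -> list[str]:
    """Keep the character card plus as many of the newest turns as fit."""
    budget = max_context - reserve_for_reply - count_tokens(card)
    kept: list[str] = []
    for turn in reversed(turns):       # walk from newest to oldest
        cost = count_tokens(turn)
        if cost > budget:
            break                      # this turn and everything older is dropped
        kept.append(turn)
        budget -= cost
    return [card] + kept[::-1]         # restore chronological order


def rough_count(text: str) -> int:
    """Stand-in counter (whitespace words); replace with a real tokenizer."""
    return len(text.split())


history = ["one two three", "four five six seven", "eight nine"]
print(fit_to_context("card text", history, rough_count,
                     max_context=10, reserve_for_reply=2))
```

In the toy call only the two newest turns fit next to the card, so the oldest turn is dropped.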

All comments are greatly appreciated. Download, test, and if you appreciate my work, consider buying me my fuel: Buy Me A Coffee

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric                               Value
Avg.                                 59.94
AI2 Reasoning Challenge (25-Shot)    66.72
HellaSwag (10-Shot)                  86.98
MMLU (5-Shot)                        65.40
TruthfulQA (0-shot)                  57.60
Winogrande (5-shot)                  82.95
GSM8k (5-shot)                        0.00
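
The reported average includes the GSM8k score of 0.00, which the Results notes describe as a broken evaluation. A short arithmetic check, using only the values from the table above, shows how much that single zero pulls the average down. The five-benchmark figure is an illustration, not an official leaderboard number.

```python
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 66.72,
    "HellaSwag (10-Shot)": 86.98,
    "MMLU (5-Shot)": 65.40,
    "TruthfulQA (0-shot)": 57.60,
    "Winogrande (5-shot)": 82.95,
    "GSM8k (5-shot)": 0.00,
}

# Average over all six benchmarks, as reported in the table.
avg_all = sum(scores.values()) / len(scores)

# Same average with the GSM8k entry left out.
rest = [v for name, v in scores.items() if not name.startswith("GSM8k")]
avg_without_gsm8k = sum(rest) / len(rest)

print(f"{avg_all:.2f}")            # 59.94
print(f"{avg_without_gsm8k:.2f}")  # 71.93
```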
