NEBULA-23.8B-v1.0
Technical notes
- 108 layers, built with the DUS (depth up-scaling) procedure: Mistral (32 layers) -> SOLAR (48) -> GALAXY (72) -> NEBULA (108)
- 23.8B parameters
- model created as an extension of the depth up-scaling procedure used for SOLAR by Upstage (a rough sketch of one up-scaling step follows this list)
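For readers unfamiliar with depth up-scaling, below is a minimal sketch of one DUS step in plain `transformers`, assuming the SOLAR-style recipe (duplicate a 32-layer Mistral stack, keep the first 24 layers of one copy and the last 24 of the other). The base repo id and layer ranges are illustrative; the exact ranges used for GALAXY (72) and NEBULA (108) are not listed in this card.

```python
import copy
import torch
from torch import nn
from transformers import AutoModelForCausalLM

# Sketch of one SOLAR-style depth up-scaling step (32 -> 48 layers).
# Repo id and layer ranges are assumptions, not the exact NEBULA recipe.
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # assumed base checkpoint
    torch_dtype=torch.bfloat16,
)

layers = base.model.layers                       # nn.ModuleList of 32 decoder layers
front = list(layers[:24])                        # layers 0-23 from the first copy
back = [copy.deepcopy(l) for l in layers[8:]]    # layers 8-31, duplicated, from the second copy

base.model.layers = nn.ModuleList(front + back)  # 48 layers total
base.config.num_hidden_layers = len(base.model.layers)
base.save_pretrained("mistral-48-layer-dus")
```

In practice this kind of layer stacking is usually expressed as a mergekit passthrough config rather than edited by hand; the mergekit configs mentioned below are the authoritative recipe.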
Results
- model can and will produce NSFW content
- GSM8K evaluation often seems to be broken (it scores 0.00 here); HellaSwag, Winogrande and TruthfulQA show that it's a capable model
- RP and ERP work surprisingly well, and I haven't encountered any GPT-isms yet
- lower memory footprint than 20B and 23B models
- follows character cards very well
- NSFW output feels fresh compared to existing models
Finetuning for RP
- SFT using the MinervaAI/Aesir-Preview dataset, 10 epochs
- DPO using the athirdpath/DPO_Pairs-Roleplay-Alpaca-NSFW dataset, 1 epoch
- SFT ran on 1x Ada 6000, ~10 h
- DPO ran on 1x RTX 3090, ~30 h
- Jupyter notebooks and mergekit configs are available for anyone wanting to reproduce or reuse the scripts - just drop me a message (a hedged outline of the DPO stage follows this list)
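The actual notebooks are not reproduced here, but the DPO stage could look roughly like the TRL sketch below. The repo id, hyperparameters, and the assumption that the dataset already exposes prompt/chosen/rejected columns are placeholders, not the author's settings; the TRL argument names also vary a bit between versions.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Rough sketch of the DPO stage, assuming a recent TRL release.
model_id = "TeeZee/NEBULA-23.8B-v1.0"  # assumed repo id of the SFT'd checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Assumes the dataset provides prompt/chosen/rejected columns (may need a mapping step).
train_data = load_dataset("athirdpath/DPO_Pairs-Roleplay-Alpaca-NSFW", split="train")

args = DPOConfig(
    output_dir="nebula-dpo",
    num_train_epochs=1,             # matches the single DPO epoch listed above
    per_device_train_batch_size=1,  # placeholder
    gradient_accumulation_steps=8,  # placeholder
    beta=0.1,                       # placeholder DPO temperature
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_data,
    processing_class=tokenizer,     # called `tokenizer=` in older TRL versions
)
trainer.train()
```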
Prompt template
- Alpaca
- the chat template is embedded in the tokenizer config and should load automatically (see the example below)
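For reference, a minimal sketch of the Alpaca layout referred to above; the embedded chat template should render an equivalent string via `tokenizer.apply_chat_template`, so building it by hand is optional.

```python
# Standard Alpaca instruction/response layout; the instruction text is just an example.
instruction = "Continue the roleplay as the character described in the card."

prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    f"### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)
```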
Context size
- 4096 tokens (a short loading/generation sketch follows)
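A hedged loading/generation sketch that stays within the 4096-token window; the repo id, dtype, and sampling settings are assumptions, not recommendations from the author.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TeeZee/NEBULA-23.8B-v1.0"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Alpaca-style prompt as shown above.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nIntroduce yourself in character.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Keep prompt tokens + generated tokens within the 4096-token context window.
output = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```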
All comments are greatly appreciated. Download, test, and if you appreciate my work, consider buying me my fuel:
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric                            | Value |
|-----------------------------------|-------|
| Avg.                              | 59.94 |
| AI2 Reasoning Challenge (25-Shot) | 66.72 |
| HellaSwag (10-Shot)               | 86.98 |
| MMLU (5-Shot)                     | 65.40 |
| TruthfulQA (0-shot)               | 57.60 |
| Winogrande (5-shot)               | 82.95 |
| GSM8k (5-shot)                    |  0.00 |
Evaluation results
| Benchmark                         | Metric              | Split      | Score |
|-----------------------------------|---------------------|------------|-------|
| AI2 Reasoning Challenge (25-shot) | normalized accuracy | test       | 66.72 |
| HellaSwag (10-shot)               | normalized accuracy | validation | 86.98 |
| MMLU (5-shot)                     | accuracy            | test       | 65.40 |
| TruthfulQA (0-shot)               | mc2                 | validation | 57.60 |
| Winogrande (5-shot)               | accuracy            | validation | 82.95 |
| GSM8k (5-shot)                    | accuracy            | test       |  0.00 |

All scores are from the Open LLM Leaderboard.