T145/Llama-3.1-8B-Instruct-Zeus

T145 committed on Nov 30, 2024

Commit ca27044 · verified · 1 Parent(s): 0081c80

Updated notes

Files changed (1)

README.md +5 -1

@@ -109,7 +109,10 @@ model-index:
 ---
 # ZEUS
-This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+Inspired by [Dampfinchen/Llama-3.1-8B-Ultra-Instruct](https://huggingface.co/Dampfinchen/Llama-3.1-8B-Ultra-Instruct),
+the goal of this merge was to create an abliterated, conversational AI restricted to 8B parameters that's coherent over long conversations.
+After testing "Ultra-Instruct" with various parameters, its grammar in responses would degrade over time.
+While more extensive testing still needs to be done, preliminary results seem to show these problems are fixed.
 ## Merge Details
@@ -169,3 +172,4 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
 |MuSR (0-shot)      | 8.57|
 |MMLU-PRO (5-shot)  |32.14|
+* Falls about 1 point behind "Ultra-Instruct" on IFEval and BBH, but everything else is a significant improvement.