Cathallama
Awesome model, my new daily driver.
Notable Performance
- 9 percentage point increase in overall MMLU-PRO success rate over Llama 3.1 70B (51.0% vs. 42.0%)
- Highest or tied-highest score in 11 of the 14 MMLU-PRO categories tested
- Passed 11 of 14 manual test cases, the most of the four models compared
Creation workflow
Models merged
- meta-llama/Meta-Llama-3.1-70B-Instruct
- turboderp/Cat-Llama-3-70B-instruct
- Nexusflow/Athene-70B
flowchart TD
A[Nexusflow_Athene] -->|Merge with| B[Meta-Llama-3.1]
C[turboderp_Cat] -->|Merge with| D[Meta-Llama-3.1]
B -->| | E[Merge]
D -->| | E[Merge]
E[Merge] -->|Result| F[Cathallama]
Testing
Hyperparameters
- Temperature: 0.0 for automated, 0.9 for manual
- Repeat penalty: 1.05
- Last N tokens considered for the repeat penalty: 256
- Repetition of newlines is penalized
- Top-K sampling: 40
- Top-P sampling: 0.95
- Min-P sampling: 0.05
llama.cpp Version
- Build: b3527-2-g2d5dd7bb
- Flags: -fa -ngl -1 -ctk f16 --no-mmap
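To make the settings above easier to reproduce, here is a minimal sketch of how they might map onto a llama.cpp command line. The binary name `llama-cli`, the sampling flag names (`--temp`, `--repeat-penalty`, `--repeat-last-n`, `--penalize-nl`, `--top-k`, `--top-p`, `--min-p`) and the model path are assumptions based on llama.cpp builds from around that time, not something the card states. The helper `build_llama_args` is hypothetical. Check `--help` on your own build, since flag names change between releases.

```python
import shlex

# Sampling settings taken from the hyperparameter list above.
# The flag names are assumptions, not confirmed by the model card.
SAMPLING = {
    "--repeat-penalty": 1.05,
    "--repeat-last-n": 256,
    "--top-k": 40,
    "--top-p": 0.95,
    "--min-p": 0.05,
}

# Runtime flags copied verbatim from the list above.
RUNTIME_FLAGS = "-fa -ngl -1 -ctk f16 --no-mmap"


def build_llama_args(model_path, temperature, binary="llama-cli"):
    """Return an argv list for one run (hypothetical helper)."""
    args = [binary, "-m", model_path, "--temp", str(temperature)]
    for flag, value in SAMPLING.items():
        args += [flag, str(value)]
    # Assumed flag for "penalize repetition of newlines".
    args.append("--penalize-nl")
    args += shlex.split(RUNTIME_FLAGS)
    return args


# Temperature 0.0 for automated runs, 0.9 for manual testing.
automated = build_llama_args("Cathallama-70B.Q4_0.gguf", 0.0)
manual = build_llama_args("Cathallama-70B.Q4_0.gguf", 0.9)
print(shlex.join(manual))
```

The two argument lists differ only in the temperature value, which matches the split between automated and manual testing described above.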
Tested Files
- Cathallama-70B.Q4_0.gguf
- Nexusflow_Athene-70B.Q4_0.gguf
- turboderp_Cat-Llama-3-70B-instruct.Q4_0.gguf
- Meta-Llama-3.1-70B-Instruct.Q4_0.gguf
Tests
Manual testing
Category | Test Case | Cathallama-70B.Q4_0.gguf | Nexusflow_Athene-70B.Q4_0.gguf | turboderp_Cat-Llama-3-70B-instruct.Q4_0.gguf | Meta-Llama-3.1-70B-Instruct.Q4_0.gguf |
---|---|---|---|---|---|
Common Sense | Ball on cup | OK | KO | KO | OK |
Common Sense | Big duck small horse | KO | OK | KO | OK |
Common Sense | Killers | OK | OK | KO | OK |
Common Sense | Strawberry r's | OK | KO | KO | KO |
Common Sense | 9.11 or 9.9 bigger | KO | OK | OK | KO |
Common Sense | Dragon or lens | KO | KO | KO | KO |
Common Sense | Shirts | OK | OK | KO | KO |
Common Sense | Sisters | OK | KO | KO | KO |
Common Sense | Jane faster | OK | OK | OK | OK |
Programming | JSON | OK | OK | OK | OK |
Programming | Python snake game | OK | KO | KO | KO |
Math | Door window combination | OK | OK | KO | KO |
Smoke | Poem | OK | OK | OK | OK |
Smoke | Story | OK | OK | KO | OK |
Note: See sample_generations.txt in the main folder of the repo for the raw generations.
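The manual testing table can be summarized by counting the OK results per model. The sketch below transcribes the table rows and tallies them. The short model labels and the `tally` helper are illustrative names, not part of the original test setup.

```python
# Column order follows the manual testing table:
# Cathallama, Athene, Cat-Llama-3, Llama-3.1 (short labels are illustrative).
MODELS = ["Cathallama", "Athene", "Cat-Llama-3", "Llama-3.1"]

# One entry per test case, outcomes in the column order above.
ROWS = [
    ("Ball on cup", "OK KO KO OK"),
    ("Big duck small horse", "KO OK KO OK"),
    ("Killers", "OK OK KO OK"),
    ("Strawberry r's", "OK KO KO KO"),
    ("9.11 or 9.9 bigger", "KO OK OK KO"),
    ("Dragon or lens", "KO KO KO KO"),
    ("Shirts", "OK OK KO KO"),
    ("Sisters", "OK KO KO KO"),
    ("Jane faster", "OK OK OK OK"),
    ("JSON", "OK OK OK OK"),
    ("Python snake game", "OK KO KO KO"),
    ("Door window combination", "OK OK KO KO"),
    ("Poem", "OK OK OK OK"),
    ("Story", "OK OK KO OK"),
]


def tally(rows):
    """Count how many test cases each model passed (hypothetical helper)."""
    counts = {model: 0 for model in MODELS}
    for _test_case, outcomes in rows:
        for model, outcome in zip(MODELS, outcomes.split()):
            if outcome == "OK":
                counts[model] += 1
    return counts


passed = tally(ROWS)
for model in MODELS:
    print(f"{model}: {passed[model]}/{len(ROWS)} passed")
```

With the table as given, this yields 11 passes for Cathallama, 9 for Athene, 4 for Cat-Llama-3 and 7 for Llama-3.1, out of 14 test cases each.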
MMLU-PRO
Model | Success % |
---|---|
Cathallama-70B | 51.0% |
turboderp_Cat-Llama-3-70B-instruct | 37.0% |
Nexusflow_Athene-70B | 41.0% |
Meta-Llama-3.1-70B-Instruct | 42.0% |
MMLU-PRO category | Cathallama-70B.Q4_0.gguf | Nexusflow_Athene-70B.Q4_0.gguf | turboderp_Cat-Llama-3-70B-instruct.Q4_0.gguf | Meta-Llama-3.1-70B-Instruct.Q4_0.gguf |
---|---|---|---|---|
Business | 50.0% | 45.0% | 20.0% | 40.0% |
Law | 40.0% | 30.0% | 30.0% | 35.0% |
Psychology | 85.0% | 80.0% | 70.0% | 75.0% |
Biology | 80.0% | 70.0% | 85.0% | 80.0% |
Chemistry | 55.0% | 40.0% | 35.0% | 35.0% |
History | 65.0% | 60.0% | 55.0% | 65.0% |
Other | 55.0% | 50.0% | 45.0% | 50.0% |
Health | 75.0% | 40.0% | 60.0% | 65.0% |
Economics | 80.0% | 75.0% | 65.0% | 70.0% |
Math | 45.0% | 35.0% | 15.0% | 40.0% |
Physics | 50.0% | 45.0% | 45.0% | 45.0% |
Computer Science | 60.0% | 55.0% | 55.0% | 60.0% |
Philosophy | 55.0% | 60.0% | 45.0% | 50.0% |
Engineering | 35.0% | 40.0% | 25.0% | 35.0% |
Note: MMLU-PRO overall was tested with 100 questions. Categories were tested with 20 questions each.
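Because the sample sizes are small, it helps to see how much one question moves a score and how the merged model compares across the tables. The sketch below uses only the numbers from the two MMLU-PRO tables above. The variable and function names are illustrative.

```python
# Overall MMLU-PRO success rates in percent (100 questions per model).
OVERALL = {
    "Cathallama-70B": 51,
    "turboderp_Cat-Llama-3-70B-instruct": 37,
    "Nexusflow_Athene-70B": 41,
    "Meta-Llama-3.1-70B-Instruct": 42,
}

# Category scores in percent (20 questions each), in table column order:
# Cathallama, Athene, Cat-Llama-3, Llama-3.1.
CATEGORIES = {
    "Business": (50, 45, 20, 40),
    "Law": (40, 30, 30, 35),
    "Psychology": (85, 80, 70, 75),
    "Biology": (80, 70, 85, 80),
    "Chemistry": (55, 40, 35, 35),
    "History": (65, 60, 55, 65),
    "Other": (55, 50, 45, 50),
    "Health": (75, 40, 60, 65),
    "Economics": (80, 75, 65, 70),
    "Math": (45, 35, 15, 40),
    "Physics": (50, 45, 45, 45),
    "Computer Science": (60, 55, 55, 60),
    "Philosophy": (55, 60, 45, 50),
    "Engineering": (35, 40, 25, 35),
}


def points_per_question(num_questions):
    """Percentage points that a single question contributes to a score."""
    return 100 / num_questions


# Gap between Cathallama and each other model, in percentage points.
deltas = {
    name: OVERALL["Cathallama-70B"] - score
    for name, score in OVERALL.items()
    if name != "Cathallama-70B"
}

# Categories where Cathallama (first column) has the highest or tied-highest score.
best_or_tied = [cat for cat, s in CATEGORIES.items() if s[0] == max(s)]
trailing = [cat for cat in CATEGORIES if cat not in best_or_tied]

print(deltas)
print(points_per_question(100), points_per_question(20))
print(len(best_or_tied), trailing)
```

A single question is worth 1 percentage point in the overall run and 5 percentage points in each category run, so small differences between models in the category table correspond to only one or two questions.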
PubmedQA
Model | Success % |
---|---|
Cathallama-70B.Q4_0.gguf | 73.00% |
turboderp_Cat-Llama-3-70B-instruct.Q4_0.gguf | 76.00% |
Nexusflow_Athene-70B.Q4_0.gguf | 67.00% |
Meta-Llama-3.1-70B-Instruct.Q4_0.gguf | 72.00% |
Request
If you are hiring in the EU or can sponsor a visa, PM me :D
PS. Thank you mradermacher for the GGUFs!