You are happy that @Meta has open-sourced Llama 3...
So you jump on the @HuggingFace Hub to download the shiny new Llama 3 model, only to find a few quintillion Llama 3's!
Which one should you use?
Not all Llamas are created equal!
An absolutely crazy comparison experiment by Wolfram Ravenwolf (@Wolfram) might answer your question!
- Comprehensive assessment of Llama 3 Instruct 70B and 8B models.
- Tested 20 versions across HF, GGUF, and EXL2 formats.
- Methodology: translation tasks plus German data-protection training exams to evaluate cross-language understanding, using deterministic generation settings to minimize random factors.
- Best performance from the EXL2 4.5bpw quant, which scored perfectly in all tests.
- GGUF quants from 8-bit down to 4-bit also performed exceptionally well.
- Llama 3 8B unquantized is the best in its size class, but not as good as the 70B quants.
- 1-bit quantizations showed significant quality drops.
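Why does bits-per-weight matter so much? A back-of-the-envelope sketch of the weight storage each quant level implies (weights only; real usage adds KV cache and activation overhead, so treat these as lower bounds):

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB at a given quantization level."""
    return n_params * bits_per_weight / 8 / 2**30

# Llama 3 70B at the EXL2 4.5bpw quant that topped the tests:
print(f"70B @ 4.5bpw: {weight_gib(70e9, 4.5):.1f} GiB")
# Llama 3 70B unquantized at fp16, for comparison:
print(f"70B @ fp16:   {weight_gib(70e9, 16):.1f} GiB")
# Llama 3 8B unquantized at fp16:
print(f"8B  @ fp16:   {weight_gib(8e9, 16):.1f} GiB")
```

Quantizing 70B to 4.5bpw cuts the weights by roughly 3.5x versus fp16, which is what makes it runnable on a fraction of the hardware.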
Best models:
- turboderp/Llama-3-70B-Instruct-exl2
- casperhansen/llama-3-70b-instruct-awq
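One way to shortlist among the many quants of a repo like turboderp/Llama-3-70B-Instruct-exl2 is by VRAM budget. A minimal, hypothetical helper (the bpw list and 4 GiB overhead figure are illustrative assumptions, not the repo's actual branch inventory; check the model page for real revisions):

```python
N_PARAMS = 70e9  # Llama 3 70B
CANDIDATE_BPW = [2.4, 3.0, 4.0, 4.5, 5.0, 6.0, 8.0]  # assumed quant levels

def pick_quant(vram_gib: float, overhead_gib: float = 4.0):
    """Return the highest bits-per-weight whose weights fit in the VRAM
    budget, reserving `overhead_gib` for KV cache and activations."""
    budget_bytes = (vram_gib - overhead_gib) * 2**30
    fitting = [b for b in CANDIDATE_BPW if N_PARAMS * b / 8 <= budget_bytes]
    return max(fitting) if fitting else None

print(pick_quant(48.0))  # e.g. two 24 GB GPUs
print(pick_quant(24.0))  # a single 24 GB GPU
```

Given the results above, aiming for the highest bpw that fits (ideally 4-bit or more) is the safe bet; only drop toward 2-bit if you have no other option.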
Blog: https://huggingface.co/blog/wolfram/llm-comparison-test-llama-3