JaaackXD/Llama-3-70B-GGUF

JaaackXD committed on Jun 10, 2024

Commit 5d2fd9b (verified) · 1 Parent(s): 466af23

Update README.md

Files changed (1)

README.md +2 -0

@@ -13,6 +13,8 @@ Including the original LLaMA 3 models file cloning from the Meta HF repo. (https
 If you have issues downloading the models from Meta or converting models for `llama.cpp`, feel free to download this one!
+### How to use the `gguf-split` / Model sharding demo : https://github.com/ggerganov/llama.cpp/discussions/6404
 ## Perplexity table on LLaMA 3 70B
 Less perplexity is better. (credit to: [dranger003](https://github.com/ggerganov/llama.cpp/pull/6745#issuecomment-2093892514))