Modified llama.cpp to generate GGUFs for Llama-3_1-Nemotron-51
#22, opened by ymcki
After two weeks of on-and-off hacking, I successfully modified llama.cpp to convert and run Llama-3_1-Nemotron-51.
https://huggingface.co/ymcki/Llama-3_1-Nemotron-51B-Instruct-GGUF
Feel free to give it a try and let me know if you find anything abnormal.
By the way, I noticed a typo/bug at line 2055 of tokenizer_config.json, where
"eos_token": "<|eot_id|>",
should be
"eos_token": "<|end_of_text|>",
transformers lets config.json override this typo, but llama.cpp cannot, so it added to my debugging time...
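In case it helps anyone else converting this model, here is a rough sketch of a check that might catch this kind of mismatch before conversion. It compares the `eos_token` string in tokenizer_config.json against the token(s) named by `eos_token_id` in config.json. The function name `eos_mismatch` is made up for illustration, and I am assuming the usual Llama 3 style layout, where `added_tokens_decoder` maps token ids (as strings) to entries with a `content` field. It is not part of llama.cpp or transformers.

```python
import json
from pathlib import Path


def eos_mismatch(tokenizer_cfg: dict, model_cfg: dict) -> bool:
    """Return True if tokenizer_config's eos_token is not among the
    tokens that config.json's eos_token_id points to.

    Hypothetical helper; assumes added_tokens_decoder is keyed by the
    token id as a string, with a "content" field per entry.
    """
    eos_token = tokenizer_cfg.get("eos_token")
    # eos_token can be a plain string or a dict carrying a "content" key.
    if isinstance(eos_token, dict):
        eos_token = eos_token.get("content")

    eos_ids = model_cfg.get("eos_token_id")
    if eos_token is None or eos_ids is None:
        return False  # nothing to compare
    if isinstance(eos_ids, int):
        eos_ids = [eos_ids]  # config.json may hold one id or a list

    decoder = tokenizer_cfg.get("added_tokens_decoder", {})
    names = {decoder[str(i)]["content"] for i in eos_ids if str(i) in decoder}
    # Only report a mismatch when at least one id could be resolved.
    return bool(names) and eos_token not in names


def check_model_dir(model_dir: str) -> bool:
    """Load both JSON files from a local model directory and compare them."""
    root = Path(model_dir)
    tokenizer_cfg = json.loads((root / "tokenizer_config.json").read_text(encoding="utf-8"))
    model_cfg = json.loads((root / "config.json").read_text(encoding="utf-8"))
    return eos_mismatch(tokenizer_cfg, model_cfg)
```

Calling `check_model_dir` on a local copy of the model folder returns `True` when the two files disagree, which would be a hint to fix tokenizer_config.json by hand before running the conversion.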