# BETA Quality, not sufficiently tested yet.

Seems to work mostly fine in my testing with the fixes from ggerganov's [PR](https://github.com/ggerganov/llama.cpp/pull/6851) applied, although it does seem to output extra `<|end|>` tokens at the end of responses when using llama.cpp's `/v1/chat/completions` endpoint.
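Until that is fixed upstream, a simple client-side workaround is to trim the stray markers yourself. This is only a minimal sketch; `strip_end_tokens` is a hypothetical helper, not part of llama.cpp:

```python
# Hypothetical client-side workaround: strip stray trailing <|end|>
# markers from a completion string before displaying it.
def strip_end_tokens(text: str, marker: str = "<|end|>") -> str:
    """Remove any trailing occurrences of the end-of-turn marker."""
    text = text.rstrip()
    while text.endswith(marker):
        text = text[: -len(marker)].rstrip()
    return text

cleaned = strip_end_tokens("Hello there!<|end|><|end|>")
```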

Additionally, the chat template is not supported by llama.cpp yet, so make sure to invoke it correctly yourself; I made a PR for this [here](https://github.com/ggerganov/llama.cpp/pull/6857).
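Since the template is not applied automatically, you have to assemble the prompt string yourself before sending it to the completion endpoint. The sketch below assumes the Phi-3-style `<|user|>` / `<|assistant|>` turns terminated by `<|end|>` suggested by the tokens above; check the model card for the authoritative template:

```python
# Build a prompt manually, assuming a <|role|> ... <|end|> turn format
# (an assumption -- verify against the model's actual chat template).
def build_prompt(messages: list[dict]) -> str:
    parts = []
    for msg in messages:
        parts.append(f"<|{msg['role']}|>\n{msg['content']}<|end|>\n")
    parts.append("<|assistant|>\n")  # cue the model to start its reply
    return "".join(parts)

prompt = build_prompt([{"role": "user", "content": "Hello!"}])
```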

# Quant Infos

- quants done with an importance matrix for improved quantization loss
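For reference, the importance-matrix workflow in llama.cpp looks roughly like the sketch below; the file names are placeholders, and the calibration text used for these quants is not specified here:

```sh
# 1) collect activation statistics over a calibration text file
./imatrix -m model-f16.gguf -f calibration.txt -o imatrix.dat

# 2) quantize with the importance matrix to reduce quantization loss
./quantize --imatrix imatrix.dat model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```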