Update README.md

```
ollama run phi3 "Your prompt here"
```

Replace "Your prompt here" with the actual prompt you want to use for generating responses from the model.
## How to use with Llamafile:

Assuming you already have the GGUF files downloaded, here is how you can use the GGUF model with [Llamafile](https://github.com/Mozilla-Ocho/llamafile):
1. **Download Llamafile-0.7.3**
```
wget https://github.com/Mozilla-Ocho/llamafile/releases/download/0.7.3/llamafile-0.7.3
```
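
Llamafile ships as a single self-contained executable, so on Linux or macOS the only remaining setup is to mark the download as executable:

```
chmod +x llamafile-0.7.3
```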

2. **Run the model with a chat-format prompt.** Phi-3's chat template wraps the prompt in `<|user|>`, `<|end|>`, and `<|assistant|>` markers:

```markdown
<|user|>\nHow to explain Internet for a medieval knight?\n<|end|>\n<|assistant|>
```

```
./llamafile-0.7.3 -ngl 9999 -m Phi-3-mini-4k-instruct-q4.gguf --temp 0.6 -p "<|user|>\nHow to explain Internet for a medieval knight?\n<|end|>\n<|assistant|>"
```
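
Here `-ngl 9999` offloads as many model layers as possible to the GPU, `-m` selects the GGUF file, `--temp 0.6` sets the sampling temperature, and `-p` passes the prompt. Llamafile accepts llama.cpp's generation flags, so (as one example, with the caveat that the exact flag set can vary between releases) you can cap the response length with `-n`:

```
./llamafile-0.7.3 -ngl 9999 -m Phi-3-mini-4k-instruct-q4.gguf --temp 0.6 -n 256 -p "<|user|>\nWrite a short haiku about the sea\n<|end|>\n<|assistant|>"
```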
3. **Run with a chat interface:**

```
./llamafile-0.7.3 -ngl 9999 -m Phi-3-mini-4k-instruct-q4.gguf
```

Your browser should open automatically and display a chat interface. (If it doesn't, open your browser manually and point it at http://localhost:8080.)
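
The chat interface is backed by a local HTTP server, so you can also script against it. A minimal sketch, assuming the default port 8080 and the llama.cpp-style `/completion` endpoint that llamafile's built-in server exposes:

```
curl http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "<|user|>\nHow to explain Internet for a medieval knight?\n<|end|>\n<|assistant|>", "n_predict": 256}'
```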
### On the command line, including multiple files at once
I recommend using the `huggingface-hub` Python library: