Update README.md

```
ollama run phi3 "Your prompt here"
```

Replace "Your prompt here" with the actual prompt you want to use for generating responses from the model.
## How to use with Llamafile:

Assuming you already have the GGUF files downloaded, here is how you can use the GGUF model with [Llamafile](https://github.com/Mozilla-Ocho/llamafile):
1. **Download Llamafile-0.7.3**
```
wget https://github.com/Mozilla-Ocho/llamafile/releases/download/0.7.3/llamafile-0.7.3
```
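
Llamafile ships as a single self-contained executable, so on Linux or macOS the only remaining setup is to mark the download as executable:

```
chmod +x llamafile-0.7.3
```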

2. **Run the model with a chat-format prompt.** Phi-3's chat template wraps the prompt in `<|user|>`, `<|end|>`, and `<|assistant|>` markers:

```markdown
<|user|>\nHow to explain Internet for a medieval knight?\n<|end|>\n<|assistant|>
```

```
./llamafile-0.7.3 -ngl 9999 -m Phi-3-mini-4k-instruct-q4.gguf --temp 0.6 -p "<|user|>\nHow to explain Internet for a medieval knight?\n<|end|>\n<|assistant|>"
```
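
Here `-ngl 9999` offloads as many model layers as possible to the GPU, `-m` selects the GGUF file, `--temp 0.6` sets the sampling temperature, and `-p` passes the prompt. Llamafile accepts llama.cpp's generation flags, so (as one example, with the caveat that the exact flag set can vary between releases) you can cap the response length with `-n`:

```
./llamafile-0.7.3 -ngl 9999 -m Phi-3-mini-4k-instruct-q4.gguf --temp 0.6 -n 256 -p "<|user|>\nWrite a short haiku about the sea\n<|end|>\n<|assistant|>"
```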
3. **Run with a chat interface:**

```
./llamafile-0.7.3 -ngl 9999 -m Phi-3-mini-4k-instruct-q4.gguf
```

Your browser should open automatically and display a chat interface. (If it doesn't, open your browser manually and point it at http://localhost:8080.)
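
The chat interface is backed by a local HTTP server, so you can also script against it. A minimal sketch, assuming the default port 8080 and the llama.cpp-style `/completion` endpoint that llamafile's built-in server exposes:

```
curl http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "<|user|>\nHow to explain Internet for a medieval knight?\n<|end|>\n<|assistant|>", "n_predict": 256}'
```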
### On the command line, including multiple files at once
I recommend using the `huggingface-hub` Python library: