# CPT-LoRA_ST-Vicuna-v1.3-3.7B-PPL-q0f16-MLC

This is the [CPT-LoRA_ST-Vicuna-v1.3-3.7B-PPL](https://huggingface.co/nota-ai/cpt-lora_st-vicuna-v1.3-3.7b-ppl) model in MLC format `q0f16`.
The model can be used with the [MLC-LLM](https://github.com/mlc-ai/mlc-llm) and [WebLLM](https://github.com/mlc-ai/web-llm) projects.
## Example Usage

Here are some examples of using this model in MLC LLM.
Before running the examples, please install MLC LLM by following the [installation documentation](https://llm.mlc.ai/docs/install/mlc_llm.html#install-mlc-packages).
### Chat

On the command line, run:
```bash
mlc_llm chat HF://nota-ai/cpt-lora_st-vicuna-v1.3-3.7b-ppl-q0f16-MLC
```
### REST Server

On the command line, run:
```bash
mlc_llm serve HF://nota-ai/cpt-lora_st-vicuna-v1.3-3.7b-ppl-q0f16-MLC
```
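Once the server is running, it can be queried over HTTP. The sketch below is a minimal client using only the Python standard library. It assumes the server listens on `http://127.0.0.1:8000` and exposes an OpenAI-style `/v1/chat/completions` endpoint; adjust `BASE_URL` if your setup differs. The helper names `build_chat_request` and `ask` are illustrative and not part of MLC LLM.

```python
import json
import urllib.request

# Assumption: `mlc_llm serve` is running locally on this address.
BASE_URL = "http://127.0.0.1:8000"
MODEL = "HF://nota-ai/cpt-lora_st-vicuna-v1.3-3.7b-ppl-q0f16-MLC"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL):
    """Build an OpenAI-style chat completion request without sending it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the request to the running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server up, `print(ask("What is the meaning of life?"))` should print the model's reply.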
### Python API

```python
from mlc_llm import MLCEngine

# Create engine
model = "HF://nota-ai/cpt-lora_st-vicuna-v1.3-3.7b-ppl-q0f16-MLC"
engine = MLCEngine(model)

# Run chat completion through the OpenAI-compatible API.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is the meaning of life?"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print("\n")

engine.terminate()
```
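If you want the full reply as a single string rather than printing it piece by piece, the streamed chunks can be collected instead. The sketch below is one way to do that. `collect_stream` and `run_demo` are hypothetical helper names, not part of MLC LLM, and the sketch assumes each chunk has the same `choices[].delta.content` shape used in the example above.

```python
def collect_stream(chunks):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        for choice in chunk.choices:
            # A chunk may carry no text (empty or None); skip it.
            if choice.delta.content:
                parts.append(choice.delta.content)
    return "".join(parts)


def run_demo():
    """Stream one completion from the model and return it as a string."""
    # Imported lazily so collect_stream can be used without MLC LLM installed.
    from mlc_llm import MLCEngine

    model = "HF://nota-ai/cpt-lora_st-vicuna-v1.3-3.7b-ppl-q0f16-MLC"
    engine = MLCEngine(model)
    try:
        stream = engine.chat.completions.create(
            messages=[{"role": "user", "content": "What is the meaning of life?"}],
            model=model,
            stream=True,
        )
        return collect_stream(stream)
    finally:
        # Release engine resources even if generation fails.
        engine.terminate()
```

Calling `print(run_demo())` runs the same prompt as above and prints the reply once generation has finished.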
### License
- All rights related to this repository and the compressed models are reserved by Nota Inc.
- The intended use is strictly limited to research and non-commercial projects.