Safetensors · English · Chinese · plm · custom_code
daven3 committed · Commit 3299de4 · verified · 1 Parent(s): 4812535

Update README.md

Files changed (1):
  1. README.md +34 -1
README.md CHANGED
@@ -96,6 +96,27 @@ PLM-1.8B is a strong and reliable model, particularly in basic knowledge underst

## How to use PLM

+ Here we introduce some methods for using the PLM models.
+
+ ### Hugging Face
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ # Load tokenizer and model (trust_remote_code is required since PLM ships custom modeling code)
+ tokenizer = AutoTokenizer.from_pretrained("PLM-Team/PLM-1.8B-Instruct", trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained("PLM-Team/PLM-1.8B-Instruct", torch_dtype=torch.bfloat16, trust_remote_code=True)
+
+ # Input text
+ input_text = "Tell me something about reinforcement learning."
+ inputs = tokenizer(input_text, return_tensors="pt")
+
+ # Completion
+ output = model.generate(inputs["input_ids"], max_new_tokens=100)
+ print(tokenizer.decode(output[0], skip_special_tokens=True))
+ ```
+
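
Since PLM-1.8B-Instruct is an instruction-tuned model, chat-style prompts are usually best built with the tokenizer's chat template. A minimal sketch continuing from the snippet above (it assumes the repository ships a chat template; the message content is illustrative):

```python
# Build the prompt via the tokenizer's chat template, then generate
messages = [{"role": "user", "content": "Tell me something about reinforcement learning."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output = model.generate(input_ids, max_new_tokens=100)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
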
### llama.cpp

The original contribution to the llama.cpp framework is [Si1w/llama.cpp](https://github.com/Si1w/llama.cpp). Here is the usage:

@@ -106,7 +127,7 @@ cd llama.cpp
pip install -r requirements.txt
  ```
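
The hunk above elides the checkout step; a sketch of the full setup, assuming the fork's default branch carries the PLM support:

```bash
# Fetch the PLM-enabled llama.cpp fork and install the conversion requirements
git clone https://github.com/Si1w/llama.cpp
cd llama.cpp
pip install -r requirements.txt
```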

- Then we can build with CPU of GPU (e.g. Orin). The build is based on `cmake`.
+ Then, we can build with CPU or GPU (e.g. Orin). The build is based on `cmake`.

- For CPU

@@ -122,6 +143,18 @@ cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
  ```
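
The diff shows only fragments of the build commands; the full sequence follows the standard llama.cpp `cmake` flow. A sketch of that convention (not text from this commit):

```bash
# CPU-only build
cmake -B build
cmake --build build --config Release

# CUDA build (e.g. for an NVIDIA Orin), matching the hunk header above
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
```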
 
+ Don't forget to download the GGUF files of PLM. We use the quantization methods in `llama.cpp` to generate the quantized PLM files.
+
+ ```bash
+ huggingface-cli download --resume-download PLM-Team/PLM-1.8B-Instruct-gguf --local-dir PLM-Team/PLM-1.8B-Instruct-gguf
+ ```
+
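
The GGUF repo already ships quantized files (e.g. Q8_0). To produce a different quantization level yourself, the llama.cpp build also provides a `llama-quantize` tool; a minimal sketch (file names are illustrative and assume an f16 GGUF export):

```bash
# Re-quantize an f16 GGUF down to 4-bit Q4_K_M (file names are illustrative)
./build/bin/llama-quantize ./PLM-1.8B-Instruct-f16.gguf ./PLM-1.8B-Instruct-Q4_K_M.gguf Q4_K_M
```
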
+ After building `llama.cpp`, we can use the `llama-cli` binary to launch PLM.
+
+ ```bash
+ ./build/bin/llama-cli -m ./PLM-Team/PLM-1.8B-Instruct-gguf/PLM-1.8B-Instruct-Q8_0.gguf -cnv -p "hello!" -n 128
+ ```
+
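
Besides the interactive CLI, the same build produces `llama-server`, which exposes an OpenAI-compatible HTTP endpoint; a minimal sketch (host and port are illustrative):

```bash
# Serve the quantized PLM over HTTP with llama.cpp's bundled server
./build/bin/llama-server -m ./PLM-Team/PLM-1.8B-Instruct-gguf/PLM-1.8B-Instruct-Q8_0.gguf --host 0.0.0.0 --port 8080
```
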
## Future works

- [ ] Release vLLM, SGLang, and PowerInfer inference scripts for PLM.