Shanshan Wang commited on
Commit
1049ba5
·
1 Parent(s): 3ff7e45

added vllm examples

Browse files
Files changed (2) hide show
  1. README.md +104 -0
  2. assets/a_cat.png +0 -0
README.md CHANGED
@@ -114,6 +114,110 @@ print(f'User: {question}\nAssistant: {response}')
114
 
115
  ```
116
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
117
 
118
  ## Prompt Engineering for JSON Extraction
119
 
 
114
 
115
  ```
116
 
117
+ ### Inference with vLLM
118
+ h2ovl-mississippi models are also supported by vllm [v0.6.4](https://github.com/vllm-project/vllm/releases/tag/v0.6.4) and later version.
119
+
120
+ First install vllm
121
+ ```bash
122
+ pip install vllm
123
+ ```
124
+
125
+ ### Offline inference
126
+ ```python
127
+ from vllm import LLM, SamplingParams
128
+ from transformers import AutoTokenizer
129
+ from PIL import Image
130
+
131
+ question = "Describe this image in detail"
132
+ image = Image.open("assets/a_cat.png")
133
+ model_name = "h2oai/h2ovl-mississippi-2b"
134
+
135
+
136
+ llm = LLM(
137
+ model=model_name,
138
+ )
139
+
140
+ tokenizer = AutoTokenizer.from_pretrained(model_name,
141
+ trust_remote_code=True)
142
+
143
+ messages = [{'role': 'user', 'content': f"<image>\n{question}"}]
144
+ prompt = tokenizer.apply_chat_template(messages,
145
+ tokenize=False,
146
+ add_generation_prompt=True)
147
+
148
+ # Stop tokens for H2OVL-Mississippi
149
+ # https://huggingface.co/h2oai/h2ovl-mississippi-2b
150
+ stop_token_ids = [tokenizer.eos_token_id]
151
+
152
+ sampling_params = SamplingParams(n=1,
153
+ temperature=0.8,
154
+ top_p=0.8,
155
+ seed=777, # Seed for reprodicibility
156
+ max_tokens=1024,
157
+ stop_token_ids=stop_token_ids)
158
+
159
+ # Single prompt inference
160
+ outputs = llm.generate({
161
+ "prompt": prompt,
162
+ "multi_modal_data": {"image": image},
163
+ },
164
+ sampling_params=sampling_params)
165
+
166
+ # look at the output
167
+ for o in outputs:
168
+ generated_text = o.outputs[0].text
169
+ print(generated_text)
170
+
171
+ ```
172
+ Pleaes see more examples at https://docs.vllm.ai/en/latest/models/vlm.html#offline-inference
173
+
174
+
175
+
176
+ ### Online inference with OpenAI-Compatible Vision API
177
+ Run the following command to start the vLLM server with the h2ovl-mississippi-2b model:
178
+ ```bash
179
+ vllm serve h2oai/h2ovl-mississippi-2b --dtype auto --api-key token-abc123
180
+ ```
181
+
182
+ ```python
183
+ from openai import OpenAI
184
+ client = OpenAI(
185
+ base_url="http://0.0.0.0:8000/v1",
186
+ api_key="token-abc123",
187
+ )
188
+
189
+ # check the model name
190
+ model_name = client.models.list().data[0].id
191
+ print(model_name)
192
+
193
+ # use chat completion api
194
+ response = client.chat.completions.create(
195
+ model=model_name,
196
+ messages=[{
197
+ 'role':
198
+ 'user',
199
+ 'content': [{
200
+ 'type': 'text',
201
+ 'text': 'describe this image in detail',
202
+ }, {
203
+ 'type': 'image_url',
204
+ 'image_url': {
205
+ 'url':
206
+ # an image example from https://galaxyofai.com/opencv-with-python-full-tutorial-for-data-science/
207
+ # this is a cat
208
+ 'https://galaxyofai.com/wp-content/uploads/2023/04/image-42.png',
209
+ },
210
+ }],
211
+ }],
212
+ temperature=0.8,
213
+ top_p=0.8)
214
+ print(response)
215
+
216
+
217
+ ```
218
+ Please see more examples at https://docs.vllm.ai/en/latest/models/vlm.html#online-inference
219
+
220
+
221
 
222
  ## Prompt Engineering for JSON Extraction
223
 
assets/a_cat.png ADDED