Update README.md

README.md CHANGED

@@ -47,6 +47,9 @@ output = llm.chat(messages=messages, sampling_params=sampling_params)
 print(output[0].outputs[0].text)
 ```
+Note: set `VLLM_DISABLE_COMPILE_CACHE=1` to disable the compile cache when running this code, e.g. `VLLM_DISABLE_COMPILE_CACHE=1 python example.py`. There are currently some issues with the composability of compile in vLLM and torchao;
+this is expected to be resolved in PyTorch 2.8.
 ## Serving
 Then we can serve with the following command:
 ```Shell