Update README.md
README.md
@@ -19,7 +19,10 @@ In order to run optimized Mamba implementations on a CUDA device, you first need
 pip install mamba-ssm causal-conv1d>=1.2.0
 ```
 
-You can run the model not using the optimized Mamba kernels, but it is **not** recommended as it will result in significantly higher latency.
+You can run the model not using the optimized Mamba kernels, but it is **not** recommended as it will result in significantly higher latency.
+
+To run on CPU, please specify `use_mamba_kernels=False` when loading the model using ``AutoModelForCausalLM.from_pretrained``.
+
 
 ## Inference
 
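
The added README line says to pass `use_mamba_kernels=False` to `AutoModelForCausalLM.from_pretrained` when running on CPU. A minimal sketch of how that might look is below. The helper names `build_load_kwargs` and `load_model` are hypothetical, the model identifier is left to the caller because the diff does not name it, and the check on the `"cuda"` prefix is an assumption about how the device string is written.

```python
from typing import Any, Dict


def build_load_kwargs(device: str) -> Dict[str, Any]:
    """Return extra keyword arguments for from_pretrained (hypothetical helper)."""
    kwargs: Dict[str, Any] = {}
    if not device.startswith("cuda"):
        # The optimized Mamba kernels require a CUDA device, so disable them
        # elsewhere, as the README change describes.
        kwargs["use_mamba_kernels"] = False
    return kwargs


def load_model(model_id: str, device: str = "cpu"):
    """Load the model on `device`; `model_id` is not given in the diff."""
    # Imported lazily so build_load_kwargs works without transformers installed.
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        model_id, **build_load_kwargs(device)
    )
    return model.to(device)
```

Calling `load_model(model_id, device="cpu")` with this README's model would then skip the optimized kernels, at the cost of the higher latency the README warns about. The flag is specific to models whose configuration defines it, so other architectures may reject it.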