ciCic committed · Commit 910d558 · verified · 1 Parent(s): ef7d746

Update README.md

Files changed (1): README.md +5 -4
README.md CHANGED
````diff
@@ -32,10 +32,10 @@ Starting with
 
 ### For CUDA users
 
-AutoAWQ
+**AutoAWQ**
+
+NOTE: this example uses `fuse_layers=True` to fuse attention and mlp layers together for faster inference
 ```python
-"""NOTE: this example uses `fuse_layers=True` to fuse attention and mlp layers together for faster inference"""
-
 from awq import AutoAWQForCausalLM
 from transformers import AutoTokenizer, TextStreamer
 
@@ -64,7 +64,8 @@ generation_output = model.generate(
 )
 ```
 
-Transformers
+**Transformers**
+
 ```python
 from transformers import AutoTokenizer, TextStreamer, AutoModelForCausalLM
 import torch
````