Update README.md
README.md CHANGED
@@ -86,4 +86,68 @@ print(outputs[0]["generated_text"])
 - He played a significant role in Singapore's rapid development, transforming the country from a poor and undeveloped nation into a modern and prosperous city-state.
 - Lee passed away in 2015, at the age of 91.
 - He was widely regarded as one of the most influential leaders of the 20th century and a key figure in the history of Singapore.
 ```
+
+### 4-bit Inferencing Example
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
+import transformers
+import torch
+
+#!nvidia-smi
+
+"""
+Wed Feb 7 12:51:07 2024
++---------------------------------------------------------------------------------------+
+| NVIDIA-SMI 535.154.05             Driver Version: 535.154.05   CUDA Version: 12.2     |
+|-----------------------------------------+----------------------+----------------------+
+| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
+| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
+|                                         |                      |               MIG M. |
+|=========================================+======================+======================|
+|   0  Tesla V100-SXM2-16GB           On  | 00000000:00:1E.0 Off |                    0 |
+| N/A   41C    P0              44W / 300W |   4950MiB / 16384MiB |      0%      Default |
+|                                         |                      |                  N/A |
++-----------------------------------------+----------------------+----------------------+
+"""
+
+model_id = "lxyuan/AeolusBlend-7B-slerp"
+
+bnb_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_use_double_quant=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype=torch.bfloat16
+)
+
+model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model,
+    tokenizer=tokenizer,
+    device_map="auto",
+)
+
+messages = [{"role": "user", "content": "What is a large language model?"}]
+prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+
+outputs = pipeline(prompt, max_new_tokens=2048, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
+
+print(outputs[0]["generated_text"])
+
+>>>
+<s>[INST] What is a large language model? [/INST]
+
+A large language model is a type of artificial intelligence system that has been trained on vast amounts of
+text data, enabling it to generate human-like responses to a wide range of written prompts. These models are
+designed to learn the patterns and rules of language, and as a result, they can perform various natural
+language processing tasks, such as translation, summarization, and question-answering, with a high degree
+of accuracy. Large language models are typically powered by deep learning algorithms and can have billions
+or trillions of parameters, making them capable of processing and understanding complex language structures
+and nuances. Some well-known examples of large language models include GPT-3, BERT, and T5.
+```
+
+- 4bit Inference Example notebook can be found [here](https://github.com/LxYuan0420/nlp/blob/main/notebooks/Inference_4bit_AeolusBlend.ipynb)
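As a rough sanity check on the ~4950 MiB of GPU memory that `nvidia-smi` reports for the 4-bit model above, the weight footprint can also be inspected directly from Python. The snippet below is a minimal sketch, not part of the committed README: it reuses the `model_id` and `BitsAndBytesConfig` settings from the example and calls the `transformers` helper `get_memory_footprint()`, which sums parameter and buffer sizes only, so the printed figure should come in somewhat below the `nvidia-smi` number (which also includes activations and the CUDA context).

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "lxyuan/AeolusBlend-7B-slerp"

# Same 4-bit settings as the README example above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

# Parameter + buffer bytes only; excludes activations and CUDA context,
# so expect a value somewhat below the ~4950 MiB shown by nvidia-smi.
print(f"Model footprint: {model.get_memory_footprint() / 1024**2:.0f} MiB")
```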