ferran-espuna committed
Commit 915111f · 1 parent: b68e373
Update README.md

README.md CHANGED
@@ -62,7 +62,38 @@ This model card corresponds to the gptq-quantized version of Salamandra-7b-instruct
 The entire Salamandra family is released under a permissive [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0).
 
 
-## Additional information
+## How to Use
+
+The following example code works under ``Python 3.9.16``, ``vllm==0.6.3.post1``, ``torch==2.4.0`` and ``torchvision==0.19.0``, though it should run on any current version of the libraries. This is an example of a conversational chatbot using the model:
+
+```
+from vllm import LLM, SamplingParams
+
+model_name = "BSC-LT/salamandra-7b-instruct-gptq"
+llm = LLM(model=model_name)
+
+messages = []
+
+while True:
+    user_input = input("user >> ")
+    if user_input.lower() == "exit":
+        print("Chat ended.")
+        break
+
+    # Add the user turn and generate a reply conditioned on the full history.
+    messages.append({'role': 'user', 'content': user_input})
+
+    outputs = llm.chat(messages,
+                       sampling_params=SamplingParams(
+                           temperature=0.5,
+                           stop_token_ids=[5],  # model-specific end-of-turn token
+                           max_tokens=200)
+                       )[0].outputs
+
+    model_output = outputs[0].text
+    print(f'assistant >> {model_output}')
+    messages.append({'role': 'assistant', 'content': model_output})
+```
 
 ### Author
 International Business Machines (IBM).
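For a quick smoke test of the section added above, a minimal single-turn sketch of the same `llm.chat` API is shown below. The model id, sampling settings, and stop token id come from the diff; the prompt string is purely illustrative:

```
from vllm import LLM, SamplingParams

# One-shot variant of the README's chat loop, using the same API.
llm = LLM(model="BSC-LT/salamandra-7b-instruct-gptq")

# Illustrative prompt; any single user turn works the same way.
messages = [{'role': 'user', 'content': 'Briefly introduce yourself.'}]
output = llm.chat(messages,
                  sampling_params=SamplingParams(temperature=0.5,
                                                 stop_token_ids=[5],
                                                 max_tokens=200))[0].outputs[0].text
print(output)
```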