### Use with transformers

Starting with `transformers >= 4.45.0` onward, you can run conversational inference using the Transformers `pipeline` abstraction or by leveraging the Auto classes with the `generate()` function.

Make sure to update your transformers installation via `pip install --upgrade transformers`.

See the snippet below for usage with Transformers:

```python
import transformers
import torch

model_id = "suzii/Llama-3.2-3B-MIS"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "Bạn là một chatbot hỗ trợ các vấn đề về hệ thống thông tin quản lý. Chỉ được phép trả lời các câu hỏi liên quan đến hệ thống thông tin quản lý. Các câu khác hãy trả lời: tôi không biết. Chỉ cần tập trung trả lời câu hỏi một cách chi tiết và chính xác nhất có thể."},
    {"role": "user", "content": "MIS là gì?"},
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])
```