---
language:
- en
- vi
- zh
base_model:
- google/gemma-2-2b-it
pipeline_tag: text-generation
tags:
- vllm
- system-role
- langchain
license: gemma
---
# gemma-2-2b-it-fix-system-role
Modified version of [gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it) with an updated **`chat_template`** that supports the **`system`** role (see the sketch below), avoiding errors such as:
- `Conversation roles must alternate user/assistant/user/assistant/...`
- `System role not supported`
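The actual fix lives in the Jinja `chat_template` bundled with the tokenizer. As a rough illustration only (not the template shipped in this repo), the hypothetical helper below shows in plain Python the usual idea: fold a leading `system` message into the first `user` turn so the strict user/assistant alternation that Gemma expects still holds.
```python
# Illustrative sketch only - the real logic is implemented inside the Jinja
# chat_template, not as a Python helper. `merge_system_into_first_user` is a
# hypothetical name used here for clarity.
def merge_system_into_first_user(messages: list[dict]) -> list[dict]:
    """Fold a leading system message into the first user message."""
    if messages and messages[0]["role"] == "system":
        system, rest = messages[0], list(messages[1:])
        if rest and rest[0]["role"] == "user":
            rest[0] = {
                "role": "user",
                "content": f"{system['content']}\n\n{rest[0]['content']}",
            }
            return rest
        # No user turn follows: re-emit the system prompt as a user turn.
        return [{"role": "user", "content": system["content"]}] + rest
    return messages

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]
print(merge_system_into_first_user(messages))
# [{'role': 'user', 'content': 'You are a helpful assistant.\n\nWho are you?'}]
```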
## Model Overview
- **Model Architecture:** Gemma 2
- **Input:** Text
- **Output:** Text
- **Release Date:** 04/12/2024
- **Version:** 1.0
## Deployment
### Use with vLLM
This model can be deployed efficiently using the [vLLM](https://docs.vllm.ai/en/latest/) backend, as shown in the example below.
With the CLI:
```bash
vllm serve dangvansam/gemma-2-2b-it-fix-system-role
```
```bash
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "dangvansam/gemma-2-2b-it-fix-system-role",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Who are you?"}
]
}'
```
With Python:
```python
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

model_id = "dangvansam/gemma-2-2b-it-fix-system-role"
sampling_params = SamplingParams(temperature=0.6, top_p=0.9, max_tokens=256)

# Render the prompt with the modified chat_template, which accepts a system role.
tokenizer = AutoTokenizer.from_pretrained(model_id)
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"}
]
prompts = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate with vLLM and print the model's reply.
llm = LLM(model=model_id)
outputs = llm.generate(prompts, sampling_params)
generated_text = outputs[0].outputs[0].text
print(generated_text)
```
vLLM also supports OpenAI-compatible serving. See the [documentation](https://docs.vllm.ai/en/latest/) for more details.
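For example, the server started with the `vllm serve` command above can be queried from Python with the official `openai` client; the sketch below assumes the default host and port and that no API key is configured.
```python
# Query the vLLM OpenAI-compatible server started above.
# Assumes the default host/port of `vllm serve` and no API key configured.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="dangvansam/gemma-2-2b-it-fix-system-role",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"},
    ],
    temperature=0.6,
    top_p=0.9,
    max_tokens=256,
)
print(response.choices[0].message.content)
```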