Converted LLaMA from QWEN2-7B-Instruct
Description
This model is Qwen2-7B-Instruct converted to the LLaMA format. The conversion lets you load and run Qwen2-7B-Instruct as if it were a LLaMA model, which is convenient for some inference use cases. Its precision is exactly the same as that of the original model.
Usage
You can load the model using the LlamaForCausalLM class as shown below:
from transformers import AutoTokenizer, LlamaForCausalLM

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt},
]
# We still use the original tokenizer from Qwen2-7B-Instruct
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
# Move the tokenized inputs to the GPU
model_inputs = tokenizer([text], return_tensors="pt").to("cuda")
# Converted LLaMA model
llama_model = LlamaForCausalLM.from_pretrained(
    "silence09/Qwen2-7B-Instruct-Converted-Llama",
    torch_dtype="auto",
).cuda()
llama_generated_ids = llama_model.generate(model_inputs.input_ids, max_new_tokens=32, do_sample=False)
# Strip the prompt tokens so only the newly generated tokens are decoded
llama_generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, llama_generated_ids)
]
llama_response = tokenizer.batch_decode(llama_generated_ids, skip_special_tokens=True)[0]
print(llama_response)
Precision Guarantee
To compare results with the original model, you can use this code.
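Independently of the linked code, a check of this kind could be sketched as follows. This is only an illustration, not the author's comparison script: the helper names `max_abs_logit_diff` and `compare_checkpoints` are hypothetical, and the sketch assumes a GPU with enough memory to hold both 7B models at once.

```python
import torch


def max_abs_logit_diff(model_a, model_b, input_ids):
    """Return the largest absolute difference between the two models' logits."""
    with torch.no_grad():
        logits_a = model_a(input_ids).logits
        logits_b = model_b(input_ids).logits
    # Cast to float32 so the subtraction itself does not add rounding error.
    return (logits_a.float() - logits_b.float()).abs().max().item()


def compare_checkpoints():
    """Hypothetical driver: load both checkpoints and print the logit gap."""
    # Imported here so the helper above can be used without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer, LlamaForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")
    prompt = "Give me a short introduction to large language model."
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()

    # Original Qwen2 checkpoint and the converted LLaMA-format checkpoint.
    original = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen2-7B-Instruct", torch_dtype="auto"
    ).cuda()
    converted = LlamaForCausalLM.from_pretrained(
        "silence09/Qwen2-7B-Instruct-Converted-Llama", torch_dtype="auto"
    ).cuda()

    diff = max_abs_logit_diff(original, converted, input_ids)
    print("max abs logit difference:", diff)
```

Calling `compare_checkpoints()` prints the largest logit gap for one prompt. A value of zero, or one very close to it, would be consistent with the claim that the converted model matches the original.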
More Info
The model was converted using the Python script available at this repository.