---
license: other
license_name: internlm-license
license_link: https://huggingface.co/internlm/internlm-chat-7b-v1_1
---
This is a GPTQ-quantized version of internlm-chat-7b-v1_1.<br>
When using it, please follow the license at https://huggingface.co/internlm/internlm-chat-7b-v1_1 <br>
<br>
Inference code:<br>
```python
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_path = r".\internlm-chat-7b-v1_1-gptq"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
gptq_config = GPTQConfig(bits=4, disable_exllama=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    quantization_config=gptq_config,
    trust_remote_code=True,
)
model = model.eval()

history = []
while True:
    txt = input("msg:")
    start_time = time.perf_counter()
    # model.chat is provided by the model's custom code (trust_remote_code=True)
    response, history = model.chat(tokenizer, txt, history=history)
    print(response)
    end_time = time.perf_counter()
    elapsed_time = end_time - start_time
    print(f"worktime:{elapsed_time}")
```