sakuraumi/Sakura-13B-Galgame

sakuraumi committed on Oct 25, 2023

Commit a02e4db · 1 parent: 0f3e4ac

Update README.md

Files changed (1)

README.md +17 -0

@@ -216,6 +216,23 @@ pipeline_tag: text-generation
 Quantize the model yourself by following the AutoGPTQ quantization tutorial in the transformers documentation, or use one of our pre-quantized models.
+Example code for inference with a quantized model (v0.8 and v0.5):
+```python
+from transformers import AutoTokenizer, GenerationConfig
+from auto_gptq import AutoGPTQForCausalLM
+path = "path/to/your/model"
+text = ""  # the Japanese text to translate
+generation_config = GenerationConfig.from_pretrained(path)
+tokenizer = AutoTokenizer.from_pretrained(path, use_fast=False, trust_remote_code=True)
+model = AutoGPTQForCausalLM.from_quantized(path, device="cuda:0", trust_remote_code=True)
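+# The prompt wraps the instruction and source text between <reserved_106> and <reserved_107>.
+prompt = f"<reserved_106>将下面的日文文本翻译成中文：{text}<reserved_107>"
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+output_ids = model.generate(**inputs, generation_config=generation_config)[0]
+# Drop the end-of-sequence marker and keep only the text after <reserved_107>.
+response = tokenizer.decode(output_ids).replace("</s>", "").split("<reserved_107>")[1]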
+print(response)
+```
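 # Fine-tuning
 The procedure matches that of LLaMA2 (v0.1-v0.4), Baichuan2 (v0.5+), or Qwen14B (v0.7), depending on the version; for prompt construction, see the inference section.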
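
The prompt construction referred to above could be factored out roughly as follows. This is a minimal sketch that assumes the template used in the quantized-inference example (v0.5 and v0.8); the helper names `build_prompt` and `extract_response` and the constant names are hypothetical and are not part of the repository.

```python
# Hypothetical helpers; the marker and instruction strings are copied
# verbatim from the README's quantized-inference example.
PROMPT_OPEN = "<reserved_106>"
REPLY_MARKER = "<reserved_107>"
INSTRUCTION = "将下面的日文文本翻译成中文："  # must stay in Chinese: it is part of the model prompt
EOS = "</s>"


def build_prompt(text: str) -> str:
    """Wrap the Japanese source text in the translation prompt template."""
    return f"{PROMPT_OPEN}{INSTRUCTION}{text}{REPLY_MARKER}"


def extract_response(decoded: str) -> str:
    """Return the text after the first reply marker, without the EOS marker."""
    _, _, reply = decoded.partition(REPLY_MARKER)
    return reply.replace(EOS, "")
```

With these, the decoding step would become `extract_response(tokenizer.decode(output_ids))`. One deliberate difference from the original one-liner: `partition` keeps everything after the first reply marker, whereas `split(...)[1]` would cut the reply at a second marker if the model ever emitted one.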