DataPilot/Arrival-32B-Instruct-v0.1

Holy-fox committed on Jan 27

Commit 8b86741 · verified · 1 Parent(s): 9c6101e

Update README.md

Files changed (1)

README.md +60 -3

README.md CHANGED

@@ -1,3 +1,60 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+language:
+- ja
+- en
+base_model:
+- Qwen/Qwen2.5-32B-Instruct
+- abeja/ABEJA-Qwen2.5-32b-Japanese-v0.1
+- cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese
+---
+## Overview
+This model builds on CyberAgent's [cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese](https://huggingface.co/cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese), a Japanese fine-tune of DeepSeek's R1-distilled model [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B). ABEJA's [abeja/ABEJA-Qwen2.5-32b-Japanese-v0.1](https://huggingface.co/abeja/ABEJA-Qwen2.5-32b-Japanese-v0.1) was added to it using ChatVector, and the result was then further fine-tuned with our own Japanese-enhancement training.
+## How to use
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model_name = "DataPilot/Arrival-32B-Instruct-v0.1"
+tokenizer_name = ""
+if tokenizer_name == "":
+    tokenizer_name = model_name
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    torch_dtype="auto",
+    device_map="auto"
+)
+tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
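+# User prompt (Japanese): "Which is larger, 9.9 or 9.11?"
+prompt = "9.9と9.11はどちらのほうが大きいですか？"
+messages = [
+    # System prompt (Japanese): "You are an excellent Japanese assistant and a long-reasoning model.
+    # Think through the problem before giving your answer."
+    {"role": "system", "content": "あなたは優秀な日本語アシスタントであり長考モデルです。問題解決をするための思考をした上で回答を行ってください。"},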
+    {"role": "user", "content": prompt}
+]
+text = tokenizer.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True
+)
+model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+generated_ids = model.generate(
+    **model_inputs,
+    max_new_tokens=1024
+)
+# Drop the prompt tokens from each sequence so only the newly generated tokens are decoded
+generated_ids = [
+    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+]
+response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+print(response)
+```
+## Acknowledgments
+We thank the DeepSeek, Qwen, ABEJA, and CyberAgent teams, the creators of the models.
+We also thank VOLTMIND for lending us the compute resources.
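
The overview in this README says one model was added to another using ChatVector, but does not give the recipe. As a rough illustration only, the sketch below shows the general weight arithmetic behind a chat-vector merge: the difference between a donor model and its base is added to a target model, parameter by parameter. Everything here is an assumption for illustration: the function name `apply_chat_vector`, the `ratio` parameter, the choice of which layers to skip, and the toy weights are hypothetical, and the README does not state which checkpoint served as the base, so this is not the authors' actual procedure. Plain Python lists stand in for tensors to keep the example self-contained.

```python
from typing import Dict, List, Tuple

# A toy "state dict": parameter name -> flat list of weights (stand-in for tensors).
StateDict = Dict[str, List[float]]


def apply_chat_vector(
    target: StateDict,
    base: StateDict,
    donor: StateDict,
    ratio: float = 1.0,
    skip_substrings: Tuple[str, ...] = ("embed_tokens", "lm_head"),
) -> StateDict:
    """Return target + ratio * (donor - base), computed per parameter.

    Parameters whose name contains any of `skip_substrings`, or that are
    missing from `base` or `donor`, are copied from `target` unchanged.
    Which layers to skip is an assumption of this sketch.
    """
    merged: StateDict = {}
    for name, target_w in target.items():
        skipped = any(s in name for s in skip_substrings)
        if skipped or name not in base or name not in donor:
            merged[name] = list(target_w)
            continue
        merged[name] = [
            t + ratio * (d - b)
            for t, d, b in zip(target_w, donor[name], base[name])
        ]
    return merged


# Toy weights, purely hypothetical.
target = {"layers.0.weight": [1.0, 2.0], "embed_tokens.weight": [3.0, 4.0]}
base = {"layers.0.weight": [0.5, 0.5], "embed_tokens.weight": [0.0, 0.0]}
donor = {"layers.0.weight": [1.0, 0.0], "embed_tokens.weight": [9.0, 9.0]}

merged = apply_chat_vector(target, base, donor)
print(merged)
```

With real checkpoints the same arithmetic would be applied to tensors loaded from each model's state dict. A `ratio` below 1.0 would add only part of the difference, which is one way to soften the effect of the merge.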