This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth)
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

# umizkimt/llm-jp-3-13b-it_lora - Japanese Instruction-Tuned LLM

## Overview
- Base Model: llm-jp/llm-jp-3-13b
- Purpose: Instruction-tuned model created for the final project of the LLM2024 course, designed for Japanese language tasks
- Key Features:
  - Improved Japanese instruction understanding
  - Preliminary enhancement of Japanese response generation
  - 13B-parameter model with instruction tuning

## Model Details
- Base Model: llm-jp/llm-jp-3-13b
- Fine-tuning Method: Unsloth LoRA (Low-Rank Adaptation); a training sketch is shown after this list
- Training Dataset: ichikara-instruction-003-001-1.json (a manually constructed instruction dataset)
- Model Parameters: 13 billion
- Training Environment:
  - Platform: Google Colaboratory
  - Hardware: T4 GPU
  - Training Duration: 46 minutes

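For reference, the block below is a minimal sketch of how a LoRA adapter like this one can be trained with Unsloth on the ichikara-instruction data. It is not the exact training script used for this model: the LoRA rank/alpha, prompt template, dataset field names (`text`, `output`), and training arguments shown here are illustrative assumptions.

```python
# Minimal LoRA fine-tuning sketch with Unsloth (illustrative only; hyperparameters,
# dataset field names, and training arguments are assumptions, not the exact values
# used to produce umizkimt/llm-jp-3-13b-it_lora).
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the base model in 4-bit so a 13B model fits on a single T4 GPU.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="llm-jp/llm-jp-3-13b",
    dtype=None,
    load_in_4bit=True,
    trust_remote_code=True,
)

# Wrap the model with LoRA adapters (rank/alpha values are assumptions).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,
)

# Load the instruction dataset (the "text"/"output" field names are assumptions).
dataset = load_dataset(
    "json",
    data_files="./ichikara-instruction-003-001-1.json",
    split="train",
)

def to_prompt(example):
    # Same prompt format as the inference code shown later in this card.
    text = f"### 指示\n{example['text']}\n### 回答\n{example['output']}" + tokenizer.eos_token
    return {"formatted_text": text}

dataset = dataset.map(to_prompt)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="formatted_text",
    max_seq_length=512,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()

# Push only the LoRA adapter to the Hugging Face Hub.
model.push_to_hub("umizkimt/llm-jp-3-13b-it_lora", token="<your-hugging-face-token>")
```

Loading the base model in 4-bit and training only the LoRA adapters is what keeps a 13B model within a single T4's memory, which matches the training environment listed above.
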
## Performance
- Omnicampus score: 3.02 (2024-11-29 19:20:27 JST)

## How to Output the Submitted .jsonl File
```python
# Install the required libraries.
%%capture
!pip install unsloth
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install -U torch
!pip install -U peft
# Library versions used when the submitted .jsonl file was generated:
# unsloth: 2024.11.10
# peft: 0.13.2
# torch: 2.5.1+cu121

# Import the required libraries.
from unsloth import FastLanguageModel
from peft import PeftModel
import torch
import json
from tqdm import tqdm
import re

# Base model and the trained LoRA adapter (specified by their Hugging Face IDs).
model_id = "llm-jp/llm-jp-3-13b"
adapter_id = "umizkimt/llm-jp-3-13b-it_lora"

# Set your Hugging Face token.
HF_TOKEN = "<your-hugging-face-token>"

# Load the base model with Unsloth's FastLanguageModel.
dtype = None  # None lets the dtype be chosen automatically
load_in_4bit = True  # True because we are handling a 13B model

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_id,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
    trust_remote_code=True,
)

# Attach the trained LoRA adapter to the base model.
model = PeftModel.from_pretrained(model, adapter_id, token=HF_TOKEN)

# Load the task data.
# Upload the data file beforehand.
datasets = []
with open("./elyza-tasks-100-TV_0.jsonl", "r") as f:
    item = ""
    for line in f:
        # A task record may span multiple lines; accumulate until the JSON object closes.
        line = line.strip()
        item += line
        if item.endswith("}"):
            datasets.append(json.loads(item))
            item = ""

# Run inference on the tasks with the model.

# Switch the model to inference mode.
FastLanguageModel.for_inference(model)

results = []
for dt in tqdm(datasets):
    input = dt["input"]

    prompt = f"""### 指示\n{input}\n### 回答\n"""

    inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

    # Greedy decoding with a repetition penalty; keep only the text after "### 回答".
    outputs = model.generate(**inputs, max_new_tokens=512, use_cache=True, do_sample=False, repetition_penalty=1.2)
    prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split('\n### 回答')[-1]

    results.append({"task_id": dt["task_id"], "input": input, "output": prediction})

# Save the results as a .jsonl file.

# The file name here is derived from adapter_id, but any file name is fine.
json_file_id = re.sub(".*/", "", adapter_id)
with open(f"/content/{json_file_id}_output.jsonl", 'w', encoding='utf-8') as f:
    for result in results:
        json.dump(result, f, ensure_ascii=False)
        f.write('\n')
```

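Optionally, before submitting, a small check like the one below (an added suggestion, not part of the original workflow) can be run in the same notebook to confirm that the generated file parses as JSON Lines and that every record contains the keys written above. It reuses the `json_file_id` variable defined in the previous cell.

```python
# Optional sanity check for the generated .jsonl file (an added suggestion,
# not part of the original submission workflow). Reuses json_file_id from above.
import json

output_path = f"/content/{json_file_id}_output.jsonl"
with open(output_path, "r", encoding="utf-8") as f:
    records = [json.loads(line) for line in f if line.strip()]

# Every record should contain the keys written by the loop above.
assert all({"task_id", "input", "output"} <= set(r) for r in records), "missing keys"
print(f"{len(records)} records written to {output_path}")
```
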
## Limitations
This model is in an early stage of development. Outputs may not consistently align with human intent and require careful validation, and the model may generate inappropriate or incorrect responses. It is recommended for experimental use with human oversight.

## License
This fine-tuned model is released under the CC BY-NC-SA 4.0 license, as it was trained on a dataset covered by the same license. The pre-trained model used as a starting point for fine-tuning is distributed under the Apache License 2.0.

## Model Card Authors
Mitsuhiro Umizaki