---
base_model: llm-jp/llm-jp-3-13b
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: cc-by-nc-sa-4.0
language:
- ja
- en
---

# Uploaded model

- **Developed by:** umizkimt
- **License:** CC BY-NC-SA 4.0
- **Finetuned from model:** llm-jp/llm-jp-3-13b

This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

# umizkimt/llm-jp-3-13b-it_lora - Japanese Instruction-Tuned LLM

## Overview
- Base Model: llm-jp/llm-jp-3-13b
- Purpose: Instruction-tuned model created for the final project of the LLM2024 course, designed for Japanese language tasks
- Key Features:
  - Improved Japanese instruction understanding
  - Preliminary improvements in Japanese response generation
  - 13B parameter model with instruction tuning

## Model Details
- Base Model: llm-jp/llm-jp-3-13b
- Fine-tuning Method: Unsloth LoRA (Low-Rank Adaptation); see the configuration sketch after this list
- Training Dataset: ichikara-instruction-003-001-1.json (A manually constructed instruction dataset)
- Model Parameters: 13 billion
- Training Environment:
  - Platform: Google Colaboratory
  - Hardware: T4 GPU
  - Training Duration: 46 minutes
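
The exact training hyperparameters are not recorded in this card. As a rough illustration of the fine-tuning method named above, the sketch below shows how a LoRA adapter is typically attached and trained with Unsloth and TRL. The rank, alpha, target modules, batch size, learning rate, and the field names of the ichikara-instruction file are assumptions (common Unsloth defaults), not the values actually used for this model.

```python
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the base model in 4-bit, as in the inference script further below.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="llm-jp/llm-jp-3-13b",
    dtype=None,
    load_in_4bit=True,
)

# Attach a LoRA adapter. All hyperparameters here are illustrative assumptions.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",
    random_state=3407,
)

# Load the instruction data and format each record into one training string.
# The "text"/"output" field names are assumptions about the ichikara file.
dataset = load_dataset("json", data_files="./ichikara-instruction-003-001-1.json", split="train")
dataset = dataset.map(lambda ex: {
    "formatted_text": f"### 指示\n{ex['text']}\n### 回答\n{ex['output']}" + tokenizer.eos_token
})

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="formatted_text",
    max_seq_length=512,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        fp16=True,  # T4 does not support bf16
        logging_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()

# Push only the LoRA adapter to the Hub (repository name taken from this card).
model.push_to_hub("umizkimt/llm-jp-3-13b-it_lora", token="<YOUR_HF_TOKEN>")
```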

## Performance
|Metric|Base Model|Fine-Tuned Model|
|---|---|---|
|Score (Gemini 1.5)|2.21|3.01|
|Inference Time (100 examples)|38 minutes|9 minutes|

- Score Type: Provisional score using Gemini 1.5 (for competition purposes); see the scoring sketch after this list
- Evaluation Dataset: elyza-tasks-100-TV_0.jsonl
- Platform: Google Colaboratory (T4 GPU)
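
The scoring script itself is not published in this card. As a heavily hedged sketch of the LLM-as-judge setup implied by the table, the snippet below asks Gemini 1.5 (via the `google-generativeai` SDK) to grade each generated answer on a 1-5 scale. The judge model name, rubric, and prompt wording are assumptions and almost certainly differ from the competition's actual evaluation.

```python
# pip install google-generativeai
import json
import google.generativeai as genai

genai.configure(api_key="<YOUR_GEMINI_API_KEY>")
judge = genai.GenerativeModel("gemini-1.5-flash")  # judge model name is an assumption

# Read the answers produced by the inference script below.
with open("/content/llm-jp-3-13b-it_lora_output.jsonl", encoding="utf-8") as f:
    results = [json.loads(line) for line in f]

scores = []
for r in results:
    # Hypothetical rubric; the actual evaluation prompt is not published here.
    prompt = (
        "Rate the following answer to the task on a scale from 1 (poor) to 5 (excellent). "
        "Reply with the number only.\n"
        f"Task: {r['input']}\nAnswer: {r['output']}"
    )
    reply = judge.generate_content(prompt).text.strip()
    if reply[:1].isdigit():
        scores.append(int(reply[0]))

print(f"Mean score over {len(scores)} judged examples: {sum(scores) / len(scores):.2f}")
```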

## Usage: Generating the .jsonl Output File
To generate the output file in Google Colaboratory, use the following script:

```python
%%capture
# Install the required libraries (run in Google Colaboratory).
!pip install unsloth
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install -U torch
!pip install -U peft

# Library versions used when generating the .jsonl file submitted to Omnicampus:
# unsloth: 2024.11.10
# peft: 0.13.2
# torch: 2.5.1+cu121
# tqdm: 4.66.6

# Import the required libraries.
from unsloth import FastLanguageModel
from peft import PeftModel
import torch
import json
from tqdm import tqdm
import re

# Base model and trained LoRA adapter (Hugging Face IDs).
model_id = "llm-jp/llm-jp-3-13b"
adapter_id = "umizkimt/llm-jp-3-13b-it_lora"

# Specify your Hugging Face token.
HF_TOKEN = "<YOUR_HF_TOKEN>"

# Load the base model with Unsloth's FastLanguageModel.
dtype = None  # None selects an appropriate dtype automatically
load_in_4bit = True  # True because we are loading a 13B model

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_id,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
    trust_remote_code=True,
)

# Attach the LoRA adapter to the base model.
model = PeftModel.from_pretrained(model, adapter_id, token=HF_TOKEN)

# Load the task data.
# Upload elyza-tasks-100-TV_0.jsonl beforehand.
datasets = []
with open("./elyza-tasks-100-TV_0.jsonl", "r") as f:
    item = ""
    for line in f:
        line = line.strip()
        item += line
        if item.endswith("}"):
            datasets.append(json.loads(item))
            item = ""

# Run inference on the tasks with the model.

# Switch the model to inference mode.
FastLanguageModel.for_inference(model)

results = []
for dt in tqdm(datasets):
    input = dt["input"]

    prompt = f"""### 指示\n{input}\n### 回答\n"""

    inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=512, use_cache=True, do_sample=False, repetition_penalty=1.2)
    prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split('\n### 回答')[-1]

    results.append({"task_id": dt["task_id"], "input": input, "output": prediction})

# Save the results as a .jsonl file.

# The file name is derived from adapter_id here, but any name will do.
json_file_id = re.sub(".*/", "", adapter_id)
with open(f"/content/{json_file_id}_output.jsonl", 'w', encoding='utf-8') as f:
    for result in results:
        json.dump(result, f, ensure_ascii=False)
        f.write('\n')
```
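
Each line of the generated file is one JSON object with `task_id`, `input`, and `output` keys, which is the structure the script above writes. A quick sanity check (the file path matches the one used above):

```python
import json

# Verify that every line of the output file is a JSON object with the expected keys.
with open("/content/llm-jp-3-13b-it_lora_output.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        assert set(record) == {"task_id", "input", "output"}
print("Output file is well-formed.")
```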

### Execution Details
- Platform: Google Colaboratory
- Hardware: T4 GPU
- Approximate Inference Time: 9 minutes

## Limitations
This model is in an early stage of development. Outputs may not consistently align with human intent and require careful validation, and the model may generate inappropriate or incorrect responses. It is recommended for experimental use with human oversight.

## License
This fine-tuned model is released under the CC BY-NC-SA 4.0 license, as it was trained on a dataset covered by the same license. The pre-trained model used as a starting point for fine-tuning is distributed under the Apache License 2.0.

### Data Attribution
This model was fine-tuned using the ichikara-instruction-003-001-1.json dataset, provided by the Institute of Physical and Chemical Research (RIKEN).
The dataset is licensed under [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-nc-sa/4.0/).
For more information about the dataset, please visit [Dataset URL](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF-%E5%85%AC%E9%96%8B/).

## Model Card Authors
Mitsuhiro Umizaki