asaoka committed on
Commit
11e2e46
1 Parent(s): 614996e

Update README.md

Files changed (1)
  1. README.md +78 -2
README.md CHANGED
@@ -1,9 +1,56 @@
 ---
 library_name: peft
 ---
-## Training procedure

 The following `bitsandbytes` quantization config was used during training:
 - quant_method: bitsandbytes
 - load_in_8bit: False
@@ -15,7 +62,36 @@ The following `bitsandbytes` quantization config was used during training:
 - bnb_4bit_quant_type: nf4
 - bnb_4bit_use_double_quant: True
 - bnb_4bit_compute_dtype: bfloat16
-### Framework versions

 - PEFT 0.5.0

The full updated README.md follows.

---
library_name: peft
---

# Model Overview

[meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) instruction-tuned on a Japanese dataset ([taka-yayoi/databricks-dolly-15k-ja](https://huggingface.co/datasets/taka-yayoi/databricks-dolly-15k-ja)).

# Usage

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load the base model with 4-bit quantization
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16
    ),
    device_map={"": 0}
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    "asaoka/Llama-2-7b-hf-qlora-dolly15k-japanese"
)

# Load the LoRA adapter
model = PeftModel.from_pretrained(
    model,
    "asaoka/Llama-2-7b-hf-qlora-dolly15k-japanese",
    device_map={"": 0}
)
model.eval()

# Prepare the prompt
prompt = "### Instruction: 富士山とは?\n\n### Response: "

# Run inference
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The usage above is based on the note article [「Google Colab で Llama-2-7B のQLoRA ファインチューニングを試す」](https://note.com/npaka/n/na7c631175111#f2af0e53-4ef3-4288-b152-6524f1b940a7) ("Trying QLoRA fine-tuning of Llama-2-7B on Google Colab").
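
Not part of the original card: if you would rather avoid bitsandbytes quantization at inference time, the adapter can also be applied to an un-quantized base model and merged. A minimal sketch, assuming enough memory for the bfloat16 weights (roughly 14 GB):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model in bfloat16 instead of 4-bit quantization
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
)
tokenizer = AutoTokenizer.from_pretrained("asaoka/Llama-2-7b-hf-qlora-dolly15k-japanese")

# Apply the LoRA adapter, then fold its weights into the base model
model = PeftModel.from_pretrained(base, "asaoka/Llama-2-7b-hf-qlora-dolly15k-japanese")
model = model.merge_and_unload()  # returns a plain transformers model
model.eval()
```

The merged model can be saved with `model.save_pretrained(...)` and then used without the `peft` dependency.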

# Training Method

- Instruction tuning + QLoRA (4-bit LoRA); a minimal setup sketch is shown after the framework versions below

- Tokenizer: the Llama-2-7b-hf tokenizer, used unmodified

The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16
 
### Framework versions

- PEFT 0.5.0
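
This card does not include the training script itself. Below is a minimal sketch of a QLoRA setup consistent with the description above (PEFT 0.5.0, 4-bit NF4 quantization). The LoRA hyperparameters, training arguments, prompt template, and dataset field names are illustrative assumptions, not values taken from this model.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    DataCollatorForSeq2Seq,
    Trainer,
    TrainingArguments,
)

# Base model in 4-bit NF4, matching the quantization config listed above
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map={"": 0},
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer.pad_token = tokenizer.eos_token  # Llama-2 has no pad token by default

# Prepare the quantized model for training and attach LoRA adapters
# (r / lora_alpha / target_modules are illustrative, not taken from this card)
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    bias="none",
    task_type="CAUSAL_LM",
))

# Build training prompts in the same format used at inference time;
# the instruction/response field names are assumed from the dolly-15k schema
dataset = load_dataset("taka-yayoi/databricks-dolly-15k-ja", split="train")

def tokenize(example):
    text = (
        f"### Instruction: {example['instruction']}\n\n"
        f"### Response: {example['response']}{tokenizer.eos_token}"
    )
    tokens = tokenizer(text, truncation=True, max_length=512)
    tokens["labels"] = tokens["input_ids"].copy()
    return tokens

train_data = dataset.map(tokenize, remove_columns=dataset.column_names)

# Train; hyperparameters are placeholders, not the ones used for this model
trainer = Trainer(
    model=model,
    train_dataset=train_data,
    args=TrainingArguments(
        output_dir="llama2-qlora-dolly15k-ja",
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        num_train_epochs=3,
        bf16=True,
        logging_steps=50,
    ),
    data_collator=DataCollatorForSeq2Seq(tokenizer, padding=True),
)
trainer.train()
model.save_pretrained("llama2-qlora-dolly15k-ja")
```

The saved adapter can then be pushed to the Hub with `model.push_to_hub(...)`.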

# JGLUE Score

| Task | Llama-2-7b-hf | This Model |
|:-|:-|:-|
| jcommonsenseqa-1.1-0.6 (acc) | 0.7274 | ? |

The [JGLUE score](https://aclanthology.org/2022.lrec-1.317/) was computed with Stability AI's [lm-evaluation-harness](https://github.com/Stability-AI/lm-evaluation-harness). The commands used are shown below: the first evaluates the base model, the second evaluates this model (the base model with the LoRA adapter applied).

```bash
python main.py \
    --model hf-causal-experimental \
    --model_args pretrained=meta-llama/Llama-2-7b-hf \
    --tasks jcommonsenseqa-1.1-0.6 \
    --num_fewshot 3 \
    --device cuda \
    --output_path ./results.json
```

```bash
python main.py \
    --model hf-causal-experimental \
    --model_args pretrained=meta-llama/Llama-2-7b-hf,peft=asaoka/Llama-2-7b-hf-qlora-dolly15k-japanese \
    --tasks jcommonsenseqa-1.1-0.6 \
    --num_fewshot 3 \
    --device cuda \
    --output_path ./results.json
```
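
Not part of the original card: a small helper for pulling the accuracy out of the generated `results.json`, assuming the standard lm-evaluation-harness output layout (a top-level `results` dict keyed by task name).

```python
import json

# Load the harness output and print the accuracy for the JGLUE task
# ("results" / "acc" keys are assumed from lm-evaluation-harness conventions)
with open("results.json") as f:
    report = json.load(f)

task = "jcommonsenseqa-1.1-0.6"
print(f"{task}: acc = {report['results'][task]['acc']:.4f}")
```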