kattyan committed
Commit 9ee41db · verified · 1 Parent(s): ec730b6

update readme

Files changed (1): README.md (+47, -4)
README.md CHANGED
@@ -8,15 +8,58 @@ tags:
 - trl
 license: apache-2.0
 language:
- - en
+ - ja
 ---
-
- # Uploaded model
+ # Uploaded Model
 
 - **Developed by:** kattyan
 - **License:** apache-2.0
- - **Finetuned from model :** llm-jp/llm-jp-3-13b
+ - **Finetuned from model:** llm-jp/llm-jp-3-13b
 
 This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
 
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
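+
+ For reference, a minimal sketch of the kind of Unsloth + TRL fine-tuning setup implied above (the dataset path, LoRA settings, and hyperparameters are illustrative assumptions, not the actual training configuration; the `SFTTrainer` signature shown matches TRL releases contemporary with transformers 4.40):
+
+ ```python
+ from unsloth import FastLanguageModel
+ from trl import SFTTrainer
+ from transformers import TrainingArguments
+ from datasets import load_dataset
+
+ # Load the base model in 4-bit, as in the Usage section below
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="llm-jp/llm-jp-3-13b",
+     max_seq_length=512,
+     load_in_4bit=True,
+ )
+
+ # Attach LoRA adapters so only a small fraction of the weights is trained;
+ # this is where Unsloth's speedup applies
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r=16,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
+                     "gate_proj", "up_proj", "down_proj"],
+     lora_alpha=16,
+ )
+
+ # Hypothetical training data; each record is assumed to carry a formatted "text" field
+ dataset = load_dataset("json", data_files="train.jsonl", split="train")
+
+ trainer = SFTTrainer(
+     model=model,
+     tokenizer=tokenizer,
+     train_dataset=dataset,
+     dataset_text_field="text",
+     max_seq_length=512,
+     args=TrainingArguments(
+         per_device_train_batch_size=2,
+         gradient_accumulation_steps=4,
+         num_train_epochs=1,
+         learning_rate=2e-4,
+         output_dir="outputs",
+     ),
+ )
+ trainer.train()
+ ```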
+
+ # Required Libraries and Their Versions
+
+ - torch>=2.3.0
+ - transformers>=4.40.1
+ - tokenizers>=0.19.1
+ - accelerate>=0.29.3
+ - flash-attn>=2.5.8
+
+ # Usage
+
+ ```python
+ from unsloth import FastLanguageModel
+
+ model_name = "llm-jp/llm-jp-3-13b"  # model name
+ max_seq_length = 512  # maximum sequence length
+ dtype = None  # data type (None = auto-detect)
+ load_in_4bit = True  # use 4-bit quantization
+
+ # Load the model and tokenizer
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name=model_name,
+     max_seq_length=max_seq_length,
+     dtype=dtype,
+     load_in_4bit=load_in_4bit,
+     token="YOUR_HUGGING_FACE_TOKEN",  # your Hugging Face access token
+ )
+
+ # Prepare the model for inference
+ FastLanguageModel.for_inference(model)
+
+ # Set the prompt
+ prompt = "LLMとはなんですか?"
+
+ # Encode the input with the tokenizer
+ inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
+
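+ # (do_sample=False gives deterministic greedy decoding; use_cache=True
+ # enables the KV cache; repetition_penalty=1.2 discourages repeated phrases)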
+ # Run generation
+ outputs = model.generate(**inputs, max_new_tokens=512, use_cache=True, do_sample=False, repetition_penalty=1.2)
+
+ # Decode the output
+ prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split('\n### 回答')[-1]
+ print(prediction)
+ ```
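+
+ Note that `.split('\n### 回答')` only has an effect when the decoded text contains an instruction-style `### 回答` ("answer") heading; with a plain prompt like the one above, it returns the full decoded text unchanged. A sketch using the common `### 指示` / `### 回答` instruction format (an assumption; the card does not specify the training template):
+
+ ```python
+ # Assumed instruction-format prompt: '### 指示' = instruction, '### 回答' = answer.
+ # With this format, the split above keeps only the generated answer.
+ prompt = """### 指示
+ LLMとはなんですか?
+ ### 回答
+ """
+ ```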