jianzhnie committed
Commit 96917c1 · 1 Parent(s): 7f8e5a2

Update README.md

Files changed (1): README.md (+40, -0)

README.md CHANGED
---
license: apache-2.0
datasets:
- tatsu-lab/alpaca
---

This repo contains a low-rank adapter for LLaMA-7b
fit on the [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) dataset.

This version of the weights was trained with the following hyperparameters (a rough `peft` `LoraConfig` equivalent is sketched after the training command below):

- Epochs: 3 (loaded from the best epoch)
- Batch size: 32
- Learning rate: 1e-4
- LoRA _r_: 8
- LoRA alpha: 16
- LoRA target modules: q_proj, v_proj

That is, the training command was:

```
python train_alpaca_lora.py \
    --model_name_or_path decapoda-research/llama-7b-hf \
    --data_path tatsu-lab/alpaca \
    --output_dir work_dir_lora/ \
    --num_train_epochs 3 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 500 \
    --save_total_limit 5 \
    --learning_rate 1e-4 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --model_max_length 2048 \
    --logging_steps 1 \
    --fp16 True
```
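
The LoRA hyperparameters above correspond roughly to the following `peft` configuration. This is a minimal sketch using the `peft` library API directly, not the exact code path inside `train_alpaca_lora.py`; settings not listed above (dropout, bias) are left at library defaults.

```python
# Rough peft-level equivalent of the LoRA settings listed above
# (a sketch, not the exact configuration code in train_alpaca_lora.py).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,                                  # LoRA rank
    lora_alpha=16,                        # LoRA scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

# Wrap the frozen base model with trainable low-rank adapter weights.
base_model = AutoModelForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter parameters are trainable
```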

Instructions for running it can be found at https://github.com/jianzhnie/open-chatgpt.
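
Once trained, the adapter can be attached to the base model for inference with `peft`. The snippet below is a usage sketch rather than part of this repo: the adapter path is a placeholder for this repo's id or a local checkpoint directory, and the prompt is only an Alpaca-style example (the exact template lives in the open-chatgpt repo linked above).

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

BASE_MODEL = "decapoda-research/llama-7b-hf"
ADAPTER = "path/or/repo-id/of-this-adapter"  # placeholder, not a real id

# Load the frozen base model in fp16 and attach the low-rank adapter on top.
base_model = LlamaForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.float16, device_map="auto"
)
tokenizer = LlamaTokenizer.from_pretrained(BASE_MODEL)
model = PeftModel.from_pretrained(base_model, ADAPTER)
model.eval()

# Alpaca-style instruction prompt (example only; see the repo above for the
# exact template used at training time).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nList three tips for staying healthy.\n\n### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```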