---
pipeline_tag: text-generation
license: apache-2.0
language:
- en
tags:
- SOLAR-10.7B-v1.0
- Open-platypus-Commercial
base_model: upstage/SOLAR-10.7B-v1.0
datasets:
- kyujinpy/Open-platypus-Commercial
model-index:
- name: T3Q-platypus-SOLAR-10.7B-v1.0
  results: []
---
Update @ 2024.03.07

## T3Q-platypus-SOLAR-10.7B-v1.0

This model is a fine-tuned version of upstage/SOLAR-10.7B-v1.0, trained on the kyujinpy/Open-platypus-Commercial dataset.

**Model Developers** Chihoon Lee (chlee10), T3Q

## Training hyperparameters

The following hyperparameters were used during training:

```python
# Hyperparameters for the dataset and number of training passes
batch_size = 16
num_epochs = 1
micro_batch = 1
gradient_accumulation_steps = batch_size // micro_batch

# Hyperparameters for the training procedure
cutoff_len = 4096
lr_scheduler = 'cosine'
warmup_ratio = 0.06  # warmup_steps = 100
learning_rate = 4e-4
optimizer = 'adamw_torch'
weight_decay = 0.01
max_grad_norm = 1.0

# LoRA config
lora_r = 16
lora_alpha = 16
lora_dropout = 0.05
lora_target_modules = ["gate_proj", "down_proj", "up_proj"]

# Options for how tokenizer inputs are handled
train_on_inputs = False
add_eos_token = False

# NEFTune params
noise_alpha = 5
```
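As a minimal sketch (not the authors' training script), the hyperparameters above can be collected into a single config dict, with the derived values the card computes (e.g. `gradient_accumulation_steps = batch_size // micro_batch`) made explicit:

```python
# Sketch only: assembles the hyperparameters listed above into one dict.
# Key names are illustrative, not taken from the authors' code.
batch_size = 16
micro_batch = 1

train_config = {
    "num_epochs": 1,
    "micro_batch": micro_batch,
    # Derived as in the card: per-step accumulation to reach batch_size
    "gradient_accumulation_steps": batch_size // micro_batch,
    "cutoff_len": 4096,
    "lr_scheduler": "cosine",
    "warmup_ratio": 0.06,
    "learning_rate": 4e-4,
    "optimizer": "adamw_torch",
    "weight_decay": 0.01,
    "max_grad_norm": 1.0,
    "lora": {
        "r": 16,
        "alpha": 16,
        "dropout": 0.05,
        "target_modules": ["gate_proj", "down_proj", "up_proj"],
    },
    "train_on_inputs": False,
    "add_eos_token": False,
    "neftune_noise_alpha": 5,
}

# With micro_batch = 1, each optimizer step accumulates 16 micro-batches,
# giving an effective batch size of 16.
effective_batch = train_config["micro_batch"] * train_config["gradient_accumulation_steps"]
print(effective_batch)  # 16
```

Note that `lora_alpha` equals `lora_r` here, so the LoRA scaling factor `alpha / r` is 1.0.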