---
license: cc-by-4.0
datasets:
- kyujinpy/Open-platypus-Commercial
language:
- en
---
**Model Details**

Model Developers: *SeungJin Lee (knlpscience)*

Base Model: *upstage/SOLAR-10.7B-v1.0*

**Notice**

***Hyperparameters I***

-batch_size : 16

-num_epochs : 1

-micro_batch : 1

-gradient_accumulation_steps : batch_size // micro_batch
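
The relation between the batch settings above can be checked with a few lines of Python. This is only an illustrative sketch: it assumes `batch_size` is the effective batch size per optimizer step on a single device, which the card does not state explicitly.

```python
# Values copied from the batch settings listed above.
batch_size = 16   # assumed: effective batch size per optimizer step
micro_batch = 1   # samples per forward/backward pass

# Number of micro-batches whose gradients are accumulated
# before each optimizer step.
gradient_accumulation_steps = batch_size // micro_batch

print(gradient_accumulation_steps)  # 16
```

With `micro_batch` set to 1, each optimizer step accumulates gradients over 16 forward/backward passes.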

***Hyperparameters II***

-cutoff_len : 4096

-lr_scheduler : 'cosine'

-warmup_ratio : 0.06

-learning_rate : 4e-4

-optimizer : 'adamw_torch'

-weight_decay : 0.01

-max_grad_norm : 1.0
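
To illustrate how `lr_scheduler`, `warmup_ratio` and `learning_rate` interact, here is a minimal sketch of a cosine schedule in plain Python. It assumes linear warmup from zero followed by cosine decay to zero, which may differ from the exact scheduler used in training. The function name `lr_at_step` and the value of `total_steps` are hypothetical; the card does not report the number of training steps.

```python
import math


def lr_at_step(step, total_steps, peak_lr=4e-4, warmup_ratio=0.06):
    """Learning rate at `step`, assuming linear warmup then cosine decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = 1000  # hypothetical value, not taken from the card
for step in (0, 30, 60, 530, 1000):
    print(step, lr_at_step(step, total_steps))
```

Under these assumptions the rate peaks at `learning_rate` once 6% of the steps have elapsed and then falls smoothly toward zero by the final step.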

***LoRA config***

-lora_r : 64

-lora_alpha : 16

-lora_dropout : 0.05

-lora_target_modules : ["gate_proj", "down_proj", "up_proj"]
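
The LoRA settings above can be collected into a plain Python dictionary to make two derived quantities explicit. This is a rough sketch, not the training code: it assumes the standard LoRA scaling factor `lora_alpha / lora_r`, and the helper `lora_params_per_linear` and the layer dimensions in the example are hypothetical, chosen only for illustration.

```python
# Values copied from the LoRA config listed above.
lora_config = {
    "lora_r": 64,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "lora_target_modules": ["gate_proj", "down_proj", "up_proj"],
}

# Standard LoRA multiplies the low-rank update by alpha / r (assumed here).
scaling = lora_config["lora_alpha"] / lora_config["lora_r"]
print(scaling)  # 0.25


def lora_params_per_linear(in_features, out_features, r):
    """Trainable parameters LoRA adds to one linear layer.

    The adapter consists of A (r x in_features) and B (out_features x r).
    """
    return r * (in_features + out_features)


# Hypothetical layer dimensions, not taken from the base model.
print(lora_params_per_linear(1024, 4096, lora_config["lora_r"]))  # 327680
```

Because `lora_alpha` is smaller than `lora_r`, the adapter update is scaled down by a factor of 0.25 under this assumption. Only the MLP projections are targeted; the attention projections are not listed in `lora_target_modules`.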