Update README.md
README.md
CHANGED
@@ -69,31 +69,6 @@ Repetition Penalty: 1.1
 Custom Stopping Strings: "\n{{user}}", "<" , "```" , -> Has occasional broken generations.
 ```
 
-
-## Training
-
-These are the key hyperparameters used during training:
-
-| Hyperparameters | Finetuning |
-|-------------------------------|----------------------------|
-| **Hardware** | 4x Nvidia L40 48GB |
-| **Batch Size** | 4x 2 |
-| **Gradient Accumulation Steps** | 4x 3 |
-| **LoRA Rank** | 32 |
-| **LoRA Alpha** | 64 |
-| **LoRA Dropout** | 0.04 |
-| **Seq_Length** | 8192 |
-| **LoRA Target Layers** | All Linear Layers |
-| **Epochs** | 2 |
-| **Max Learning Rate** | 2e-4 |
-| **Min Learning Rate** | 4e-5 |
-| **Optimizer** | adamw_bnb_8bit |
-| **Optimizer Args** | Warmup: True | Steps: 20
-| **Scheduler** | cosine_with_min_lr |
-| **Warmup Steps** | 4% |
-
 ## License
 
 Nephra v1 falls under [META LLAMA 3 COMMUNITY LICENSE AGREEMENT](https://huggingface.co/meta-llama/Meta-Llama-3-8B/blob/main/LICENSE).
-
-
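For reference, the removed table lists a `cosine_with_min_lr` scheduler with a maximum learning rate of 2e-4, a minimum of 4e-5, and a 4% warmup. The sketch below shows one common form of such a schedule (linear warmup, then cosine decay to a floor) in plain Python. The function name, its parameters, and the `total_steps` value are assumptions made for illustration; the README does not state the run length, and the trainer's actual implementation may differ in detail.

```python
import math

# Values copied from the removed hyperparameter table.
MAX_LR = 2e-4
MIN_LR = 4e-5


def cosine_with_min_lr(step, total_steps, warmup_steps,
                       max_lr=MAX_LR, min_lr=MIN_LR):
    """Hypothetical helper: linear warmup to max_lr, then cosine decay to min_lr."""
    if step < warmup_steps:
        # Linear ramp from 0 up to max_lr over the warmup phase.
        return max_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    # Cosine factor goes from 1.0 (start of decay) to 0.0 (end of training).
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (max_lr - min_lr) * cosine


# Assumed run length, chosen only for illustration.
total_steps = 500
warmup_steps = total_steps * 4 // 100  # 4% warmup, as in the table

for step in (0, warmup_steps, total_steps // 2, total_steps):
    lr = cosine_with_min_lr(step, total_steps, warmup_steps)
    print(f"step {step:>3}: lr = {lr:.2e}")
```

Under these assumptions the learning rate starts at zero, peaks at the maximum once warmup ends, and settles at the minimum on the final step rather than decaying to zero.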