Commit 9ff3927 by MicroPanda123
Parent: 850ba12

Update README.md

README.md CHANGED
@@ -14,9 +14,9 @@ gradient_accumulation_steps = 64
 ```
 This was because I was training it locally on RTX2060 and did not have enough power to train it on higher settings.
 Model is stored in "model" folder that contains model itself and "info.txt" file containing:
-iter_num - number of iterations
-train_loss - training loss at time of checkpoint
-val_loss - validation loss at time of checkpoint
-config - nanoGPT config
+- iter_num - number of iterations
+- train_loss - training loss at time of checkpoint
+- val_loss - validation loss at time of checkpoint
+- config - nanoGPT config
 
 At first I made it only save model after validation loss improved, to not allow overfitting, but after some time I decided to risk it and turned that off and allowed it to save everytime, luckly it worked out fine.
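
Reviewer note: the README text in this hunk describes two checkpoint policies (save only when validation loss improves, or save at every evaluation) and an "info.txt" file with four fields. The following is a minimal sketch of that idea, not the author's training code. The function names `should_save`, `simulate`, and `write_info` are hypothetical, and writing "info.txt" as JSON is an assumption, since the README does not state the file's format.

```python
import json
from pathlib import Path


def should_save(val_loss, best_val_loss, always_save):
    """Return True if a checkpoint should be written at this evaluation."""
    # always_save=True mirrors "allowed it to save everytime";
    # always_save=False mirrors "only save after validation loss improved".
    return always_save or val_loss < best_val_loss


def simulate(val_losses, always_save):
    """Return the evaluation indices at which a checkpoint would be saved."""
    best = float("inf")
    saved = []
    for step, val_loss in enumerate(val_losses):
        if should_save(val_loss, best, always_save):
            saved.append(step)
        best = min(best, val_loss)
    return saved


def write_info(folder, iter_num, train_loss, val_loss, config):
    """Write the four fields listed in the README to info.txt (JSON is assumed)."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    info = {
        "iter_num": iter_num,
        "train_loss": train_loss,
        "val_loss": val_loss,
        "config": config,
    }
    path = folder / "info.txt"
    path.write_text(json.dumps(info, indent=2))
    return path
```

With validation losses `[3.0, 2.5, 2.7, 2.4]`, the improvement-only policy skips the third evaluation, while the always-save policy overwrites the checkpoint there even though validation loss went up. That overwrite is the overfitting risk the README mentions.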