something-else
commited on
Commit
•
d1d4eb4
1
Parent(s):
84737d8
Update README.md
Browse files
README.md
CHANGED
@@ -21,4 +21,5 @@ tags:
|
|
21 |
- rwkv-v5-stp32-N8.pth : 3B rocm-rwkv model starting with the previous but now with 32 epochs of N8 dataset with --lr_init 7e-6 --lr_final 7e-6. This pth has a loss of 1.810 for N8 and 22.46 GTokens.
|
22 |
- rwkv-v5-stp46-N8.pth : 3B rocm-rwkv model starting with the previous but now with 46 epochs of N8 dataset with --lr_init 7e-6 --lr_final 7e-6. This pth has a loss of 1.800 for N8 and 31.874 GTokens.
|
23 |
- rwkv-v5-stp62-N8.pth : 3B rocm-rwkv model starting with the previous but now with 62 epochs of N8 dataset with --lr_init 7e-6 --lr_final 7e-6. This pth has a loss of 1.790 for N8 and 42.538 GTokens.
|
24 |
-
- rwkv-v5-stp76-N8.pth : 3B rocm-rwkv model starting with the previous but now with 62 epochs of N8 dataset with --lr_init 7e-6 --lr_final 7e-6. This pth has a loss of 1.780 for N8 and 51.763 GTokens.
|
|
|
|
21 |
- rwkv-v5-stp32-N8.pth : 3B rocm-rwkv model starting with the previous but now with 32 epochs of N8 dataset with --lr_init 7e-6 --lr_final 7e-6. This pth has a loss of 1.810 for N8 and 22.46 GTokens.
|
22 |
- rwkv-v5-stp46-N8.pth : 3B rocm-rwkv model starting with the previous but now with 46 epochs of N8 dataset with --lr_init 7e-6 --lr_final 7e-6. This pth has a loss of 1.800 for N8 and 31.874 GTokens.
|
23 |
- rwkv-v5-stp62-N8.pth : 3B rocm-rwkv model starting with the previous but now with 62 epochs of N8 dataset with --lr_init 7e-6 --lr_final 7e-6. This pth has a loss of 1.790 for N8 and 42.538 GTokens.
|
24 |
+
- rwkv-v5-stp76-N8.pth : 3B rocm-rwkv model starting with the previous but now with 62 epochs of N8 dataset with --lr_init 7e-6 --lr_final 7e-6. This pth has a loss of 1.780 for N8 and 51.763 GTokens.
|
25 |
+
- rwkv-v5-stp118-N8.pth : 3B rocm-rwkv model starting with the previous but now with 118 epochs of N8 dataset with --lr_init 7e-6 --lr_final 7e-6. This pth has a loss of 1.750 for N8 and 79.508 GTokens.
|