Commit a1114d3, committed by something-else
1 Parent(s): a10c137
Update README.md
README.md
CHANGED
@@ -11,5 +11,5 @@ tags:
 - rwkv-final-chnk5.pth: 3B rocm-rwkv model trained with Slim pajama chunk1-5 and with a loss of 2.456.
 - rwkv-final-chnk17.pth: 3B rocm-rwkv model trained with Slim pajama chunk1-10 for the first epoch and an aditional training with chunk1-7 after the first epoch and with a loss of 2.281
 - rwkv-code39-16012024.pth: 3B rocm-rwkv model trained with Slim pajama chunk1-10 for the first epoch and an aditional training with chunk1-8 after the first epoch; plus a little bit of code. This pth has a loss of 1.174 for code alone and 2.26 for text.
-- rwkv-HHMIX-63x1-47-29012024.pth: 3B rocm-rwkv model starting with rwkv-code39-16012024.pth plus a mix of multi-language and code. This model
-- rwkv-coder-63x1-104-29012024.pth: 3B rocm-rwkv model starting with rwkv-HHMIX-63x1-47-29012024.pth plus more code (71.21 Gtokens of code).
+- rwkv-HHMIX-63x1-47-29012024.pth: 3B rocm-rwkv model starting with rwkv-code39-16012024.pth plus a mix of multi-language and code. This model has a loss value of 2.065 for the code+multilingual dataset.
+- rwkv-coder-63x1-104-29012024.pth: 3B rocm-rwkv model starting with rwkv-HHMIX-63x1-47-29012024.pth plus more code (71.21 Gtokens of code). This model has a loss value of 1.090 for the code dataset.
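
The loss values this commit adds to the README can be easier to interpret as perplexities. The sketch below assumes the reported losses are mean per-token cross-entropy in nats; the README does not state this, so treat it as an assumption. The helper name `perplexity` and the dictionary keys are illustrative only. The figures were measured on different datasets (text, code, code+multilingual), so they are not directly comparable across entries.

```python
import math

# Loss values copied from the README entries in this diff.
# Assumption: each is a mean per-token cross-entropy measured in nats.
reported_losses = {
    "rwkv-final-chnk5.pth (text)": 2.456,
    "rwkv-final-chnk17.pth (text)": 2.281,
    "rwkv-code39-16012024.pth (code)": 1.174,
    "rwkv-code39-16012024.pth (text)": 2.26,
    "rwkv-HHMIX-63x1-47-29012024.pth (code+multilingual)": 2.065,
    "rwkv-coder-63x1-104-29012024.pth (code)": 1.090,
}


def perplexity(loss_nats: float) -> float:
    """Convert a mean cross-entropy loss in nats to perplexity."""
    return math.exp(loss_nats)


if __name__ == "__main__":
    for name, loss in reported_losses.items():
        print(f"{name}: loss={loss:.3f}, perplexity={perplexity(loss):.2f}")
```

If the losses were instead reported in bits per token, the conversion would be `2 ** loss` rather than `math.exp(loss)`.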