Commit 9e12a15
Parent(s): eadf500
Update README.md
README.md
CHANGED
@@ -44,3 +44,11 @@ tags:
 - rwkv-9Q-16k-step6-0-4.pth: Using rwkv-9Q-4k-stp248.pth I added N-0 and N-8 and a Ctx=16384 loss=1.65. This model looks like it can chat better.
 - rwkv-9Q-step607-N8-3.pth: Using rwkv-9Q-16k-step6-0-4.pth I added 100G tokens of N8-3.
 - rwkv-9Q-4k-stp662-N8-3.pth: Using rwkv-9Q-step607-N8-3.pth I added 10G tokens more of N8-3.
+
+V6 models:
+
+6B rocm-rwkv pth record: 12 layers embd=6144 ctx=4096.
+
+- rwkv-6B-N3-final.pth: 6B rocm-rwkv model trained with N3, with a final loss=3.56 after 100G tokens.
+- rwkv-6B-N0-final.pth: starting from the previous pth, rocm-rwkv trained with N0, with a final loss=3.11 after 100G tokens.
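For reference, checkpoints like the ones listed above are ordinary PyTorch state dicts, so their layer count and embedding width can be read back from tensor shapes. A minimal sketch (the tiny stand-in dict below is illustrative; it only assumes the common RWKV key naming `emb.weight` / `blocks.N.…`, and a real file such as rwkv-6B-N0-final.pth would be loaded the same way):

```python
import torch

# Stand-in for a real RWKV checkpoint: a plain dict of named tensors.
dummy = {
    "emb.weight": torch.zeros(16, 8),             # (vocab, n_embd)
    "blocks.0.att.key.weight": torch.zeros(8, 8),  # one layer's block
}
torch.save(dummy, "demo.pth")

# Load on CPU and recover the config from the shapes/keys.
state = torch.load("demo.pth", map_location="cpu")
n_embd = state["emb.weight"].shape[1]
n_layer = len({k.split(".")[1] for k in state if k.startswith("blocks.")})
print(n_embd, n_layer)  # 8 1
```

For the 6B records above one would expect `n_embd` to come back as 6144 and `n_layer` as 12.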