Update README.md
README.md
CHANGED
@@ -1,5 +1,7 @@
 ---
 license: cc-by-nc-4.0
+datasets:
+- clane9/NSD-Flat
 ---
 
 # Model card for `boldgpt_small_patch10.kmq`
@@ -31,3 +33,17 @@ batch["activity"] = transform(batch["activity"])
 # output: (B, N + 1, K) predicted next token logits
 output, state = model(batch)
 ```
+
+## Reproducing
+
+- Training command:
+
+```bash
+torchrun --standalone --nproc_per_node=4 \
+scripts/train_gpt.py --out_dir results \
+--model boldgpt_small \
+--ps 10 --vs 1024 --vocab_state checkpoints/ps-10_vs-1024_vss-4000_seed-42/tok_state.pt \
+--shuffle --epochs 1000 --bs 512 \
+--workers 0 --amp --compile --wandb
+```
+- Commit: `f9720ca52d6fa6b3eb47a34cf95f8e18a8683e4c`