Commit History
feat(modeling): simplify abstract_init
fa72aa7
feat(train) - handle multiple nodes (#130)
0952927
feat: handle model parallel
1bb3269
feat(train): more custom x-axis
5f28cd2
feat(train): split artifact into model/state (#128)
7c4c287
fix: style
386f839
fix(train): opt_state_shape for distributed_shampoo
225b6ff
feat(train): split artifact into model/state
fa5b058
style(tokenizer): remove unused variables
605df32
feat: use fast tokenizer
767d78a
feat(train): another 25% faster
14abe8c
Merge pull request #127 from borisdayma/pjit-t5x
e4401dd
feat(train): overhead from 70% to 1% 🥳
2b7f5f1
feat(pjit): follow t5x style
7b5868f
fix(train): grads spec
00710bc
feat(train): improve pjit speed
f254058
fix(train): consider correct batch size
b7c7458
feat(train): custom start_preconditioning_step
8149924
feat(train): handle distributed_shampoo in pjit
032f623
feat: update distributed_shampoo + fix None spec
8a9e367
feat(train): distributed_shampoo with pjit
cc34d07
feat(train): use pjit (#125)
f5239e1
style: unused import
7a176b9
fix style
f044cb8
feat(train): restore opt_state efficiently
1bfc1b5
feat(model): clean way to load on cpu
12f323d
feat(train): load model on CPU
3d43591
feat(train): different rng per node
2d212d8
feat(train): no batch dimension with pjit
df1fe19
feat(train): progress on pjit
49597a2
feat(train): start pjit support
0081723
feat: use_artifact if run existing
a5ed112
Load from wandb artifact (#121)
f69b21b
Style (isort).
f9d51f7
Pedro Cuenca
committed on
feat(train): update sweep config
bbbf7c8
Use DalleBartTokenizer. State restoration reverted to previous method:
ae983d7
Pedro Cuenca
committed on
Tokenizer, config, model can be loaded from wandb.
7e48337
Pedro Cuenca
committed on
fix(train): variable not defined
4c87adf
feat(train): cleanup args
a2bf605
Merge pull request #122 from borisdayma/feat-acccum
c91ceb7
feat(data): support accumulation in non-streaming
88c8e06
refactor(train): cleanup
274ba73
feat: custom gradient accumulation
2d07559
fix: style
df01fa8
feat(train): use MultiSteps for gradient accumulation
4fa53a5
Change import order again.
2b2be9b
Pedro Cuenca
committed on
Fix import order to make isort happy.
64d99b2
Pedro Cuenca
committed on
Accept changes suggested by linter.
9f522b8
Pedro Cuenca
committed on
Update help string for `model_name_or_path`.
290e443
Pedro Cuenca
committed on