Commit History
Merge branch 'main' of https://github.com/borisdayma/dalle-mini into main
bcd360f
feat: better multi-node support (#158)
728a3c3
feat(text): support emojis (#154)
7ef7bd9
fix: smelu
7f2f8ed
fix: sinkformer
2c583b3
fix: support smelu
a2dcee4
feat: allow relative position (#156)
769d20a
feat: sinkhorn in lse mode (#155)
00d4661
fix: sinkformer gradient
eed4896
feat(model): allow bias (#152)
361a994
feat: add sinkformer + custom final ln + pre-ln (#151)
f139b0b
feat: placeholders for more config
69bcbeb
feat: force final ln in encoder
32f4ba5
feat: allow more configurations
5bd4c20
fix: DeepNet doesn't scale weights of embedding/output layers (#150)
503d6b4
Shuming Ma
feat: remove unnecessary LN
02824a7
feat: add cogview
472c4cc
fix(textnormalizer): consider utf8 on windows (#148)
3b8d8cb
illtellyoulater
feat: implement transformer variants (#144)
542378c
feat(data): super conditioning (#141)
7939874
feat: support pod (#139)
803ccbf
feat: handle gradient checkpointing
5173ec7
feat: load from bucket
1c4e839
feat: reduce artifact space + offset step
34cf91c
feat: restore weights on CPU
5f954fc
fix: position embedding for generate method
ebac379
fix: typo
68cc185
fix: load from checkpoint
44b7c3e
feat(modeling): simplify abstract_init
fa72aa7
feat(train) - handle multiple nodes (#130)
0952927
feat: handle model parallel
1bb3269
fix: style
386f839
style(tokenizer): remove unused variables
605df32
feat: use fast tokenizer
767d78a
feat(train): improve pjit speed
f254058
fix(train): consider correct batch size
b7c7458
feat(train): distributed_shampoo with pjit
cc34d07
style: unused import
7a176b9
feat(model): clean way to load on cpu
12f323d
feat(train): no batch dimension with pjit
df1fe19
feat(train): progress on pjit
49597a2
feat: use_artifact if run existing
a5ed112
Load from wandb artifact (#121)
f69b21b
Style (isort).
f9d51f7
Pedro Cuenca
Tokenizer, config, model can be loaded from wandb.
7e48337
Pedro Cuenca
feat(data): support accumulation in non-streaming
88c8e06
feat: custom gradient accumulation
2d07559
Change import order again.
2b2be9b
Pedro Cuenca
Fix import order to make isort happy.
64d99b2
Pedro Cuenca