StepLaw's collections (41)

StepLaw-N_59M-D_7.0B
Models with 59M parameters trained with 7.0B tokens.
StepLaw-N_59M-D_1.0B
Models with 59M parameters trained with 1.0B tokens.
StepLaw-N_536M-D_7.0B
Models with 536M parameters trained with 7.0B tokens.
StepLaw-N_536M-D_3.0B
Models with 536M parameters trained with 3.0B tokens.
StepLaw-N_536M-D_19.0B
Models with 536M parameters trained with 19.0B tokens.
StepLaw-N_429M-D_99.0B
Models with 429M parameters trained with 99.0B tokens.
StepLaw-N_429M-D_49.0B
Models with 429M parameters trained with 49.0B tokens.
StepLaw-N_429M-D_3.0B
Models with 429M parameters trained with 3.0B tokens.
StepLaw-N_429M-D_19.0B
Models with 429M parameters trained with 19.0B tokens.
StepLaw-N_268M-D_99.0B
Models with 268M parameters trained with 99.0B tokens.
StepLaw-N_268M-D_7.0B
Models with 268M parameters trained with 7.0B tokens.
StepLaw-N_268M-D_3.0B
Models with 268M parameters trained with 3.0B tokens.
StepLaw-N_268M-D_19.0B
Models with 268M parameters trained with 19.0B tokens.
StepLaw-N_268M-D_1.0B
Models with 268M parameters trained with 1.0B tokens.
StepLaw-N_214M-D_7.0B
Models with 214M parameters trained with 7.0B tokens.
StepLaw-N_214M-D_3.0B
Models with 214M parameters trained with 3.0B tokens.
StepLaw-N_214M-D_1.0B
Models with 214M parameters trained with 1.0B tokens.
StepLaw-N_119M-D_3.0B
Models with 119M parameters trained with 3.0B tokens.
StepLaw-N_1.0B-D_7.0B
Models with 1.0B parameters trained with 7.0B tokens.
StepLaw-N_1.0B-D_3.0B
Models with 1.0B parameters trained with 3.0B tokens.
StepLaw-N_1.0B-D_1.0B
Models with 1.0B parameters trained with 1.0B tokens. Architecture: H=2048, FFN=8192, Heads=16, Layers=16.
StepLaw-N_59M-D_3.0B
Models with 59M parameters trained with 3.0B tokens.
StepLaw-N_536M-D_9.0B
Models with 536M parameters trained with 9.0B tokens.
StepLaw-N_536M-D_49.0B
Models with 536M parameters trained with 49.0B tokens.
StepLaw-N_536M-D_28.0B
Models with 536M parameters trained with 28.0B tokens.
StepLaw-N_536M-D_1.0B
Models with 536M parameters trained with 1.0B tokens.
StepLaw-N_429M-D_7.0B
Models with 429M parameters trained with 7.0B tokens.
StepLaw-N_429M-D_39.0B
Models with 429M parameters trained with 39.0B tokens.
StepLaw-N_429M-D_22.0B
Models with 429M parameters trained with 22.0B tokens.
StepLaw-N_429M-D_1.0B
Models with 429M parameters trained with 1.0B tokens.
StepLaw-N_268M-D_79.0B
Models with 268M parameters trained with 79.0B tokens.
StepLaw-N_268M-D_4.0B
Models with 268M parameters trained with 4.0B tokens.
StepLaw-N_268M-D_24.0B
Models with 268M parameters trained with 24.0B tokens.
StepLaw-N_268M-D_14.0B
Models with 268M parameters trained with 14.0B tokens.
StepLaw-N_214M-D_99.0B
Models with 214M parameters trained with 99.0B tokens.
StepLaw-N_214M-D_19.0B
Models with 214M parameters trained with 19.0B tokens.
StepLaw-N_214M-D_11.0B
Models with 214M parameters trained with 11.0B tokens.
StepLaw-N_119M-D_7.0B
Models with 119M parameters trained with 7.0B tokens.
StepLaw-N_119M-D_1.0B
Models with 119M parameters trained with 1.0B tokens.
StepLaw-N_1.0B-D_56.0B
Models with 1.0B parameters trained with 56.0B tokens.
StepLaw-N_1.0B-D_19.0B
Models with 1.0B parameters trained with 19.0B tokens.
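
The collection names above appear to follow the pattern StepLaw-N_<parameters>-D_<tokens>, with M for millions and B for billions. As a minimal sketch, assuming that pattern holds for every collection, a name can be split into numeric parameter and token counts as follows. The pattern is inferred from the names listed here, and `StepLawCollection` and `parse_collection_name` are hypothetical helpers, not part of any StepLaw tooling.

```python
import re
from typing import NamedTuple


class StepLawCollection(NamedTuple):
    """Hypothetical container for the two numbers encoded in a collection name."""
    params: float  # model parameter count (N)
    tokens: float  # training token count (D)


# Multipliers for the unit suffixes seen in the names on this page.
_SUFFIX = {"M": 1e6, "B": 1e9}

# Assumed naming pattern, inferred from the listing: StepLaw-N_<num><M|B>-D_<num><M|B>
_NAME = re.compile(r"^StepLaw-N_(\d+(?:\.\d+)?)([MB])-D_(\d+(?:\.\d+)?)([MB])$")


def parse_collection_name(name: str) -> StepLawCollection:
    """Parse a name such as 'StepLaw-N_59M-D_7.0B' into (params, tokens)."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized collection name: {name!r}")
    n_val, n_unit, d_val, d_unit = match.groups()
    return StepLawCollection(
        params=float(n_val) * _SUFFIX[n_unit],
        tokens=float(d_val) * _SUFFIX[d_unit],
    )


if __name__ == "__main__":
    for name in ("StepLaw-N_59M-D_7.0B", "StepLaw-N_1.0B-D_56.0B"):
        c = parse_collection_name(name)
        print(f"{name}: N={c.params:.3g}, D={c.tokens:.3g}")
```

Parsing into numbers rather than keeping the raw strings makes it straightforward to sort or group the collections by model size or token budget.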