Update README.md
README.md
CHANGED
@@ -11,10 +11,11 @@ datasets:
 
 # ModernBERT2gpt2-700m baseline
 
-EncoderDecoder created from modernBERT-large and random-init `gpt2` trained on the pszemraj/t2t-re_pretrain-small dataset for one epoch as a "baseline"
+EncoderDecoder created from modernBERT-large and random-init `gpt2`, trained on the pszemraj/t2t-re_pretrain-small dataset for one epoch as a "baseline". Logs and the training script can be found [on wandb](https://wandb.ai/pszemraj/enc-dec-modernbert-olmo/runs/xpg9wjco).
 
 - input context length 2048
 - output context length 512
+- single tokenizer, slightly modified from modernBERT
 
 It achieves the following results on the evaluation set:
 - Loss: 2.2113
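For context, here is a minimal sketch of how an encoder-decoder like the one described above can be assembled with Hugging Face `transformers`. It is not the card's actual training script (that lives in the wandb run linked in the diff); the `answerdotai/ModernBERT-large` checkpoint name and the token-id wiring are assumptions.

```python
# Sketch only: pair a pretrained ModernBERT encoder with a randomly
# initialized GPT-2 decoder. Checkpoint names and token-id choices are
# assumptions, not taken from the model's actual training script.
from transformers import (
    AutoConfig,
    AutoModel,
    AutoModelForCausalLM,
    AutoTokenizer,
    EncoderDecoderModel,
)

# Pretrained encoder (assumed checkpoint name).
encoder = AutoModel.from_pretrained("answerdotai/ModernBERT-large")

# Random-init `gpt2` decoder: built from config alone (no pretrained
# weights), with cross-attention enabled so it can attend to encoder states.
decoder_config = AutoConfig.from_pretrained("gpt2")
decoder_config.is_decoder = True
decoder_config.add_cross_attention = True
decoder = AutoModelForCausalLM.from_config(decoder_config)

model = EncoderDecoderModel(encoder=encoder, decoder=decoder)

# Single shared tokenizer taken from the encoder side, per the model card.
tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-large")
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
model.config.eos_token_id = tokenizer.sep_token_id

# Context lengths from the card: 2048 tokens in, up to 512 tokens out.
inputs = tokenizer(
    "An example input document.",
    truncation=True,
    max_length=2048,
    return_tensors="pt",
)
output_ids = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Building the decoder with `from_config` rather than `from_pretrained` is what gives the random initialization described in the card; only the encoder starts from pretrained weights.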