Update README.md
README.md
CHANGED
@@ -11,10 +11,11 @@ datasets:
 
 # ModernBERT2gpt2-700m baseline
 
-EncoderDecoder created from modernBERT-large and random-init `gpt2` trained on the pszemraj/t2t-re_pretrain-small dataset for one epoch as a "baseline"
+EncoderDecoder created from modernBERT-large and random-init `gpt2`, trained on the pszemraj/t2t-re_pretrain-small dataset for one epoch as a "baseline". Logs and the training script can be found [on wandb](https://wandb.ai/pszemraj/enc-dec-modernbert-olmo/runs/xpg9wjco).
 
 - input context length 2048
 - output context length 512
+- single tokenizer, slightly modified from modernBERT
 
 It achieves the following results on the evaluation set:
 - Loss: 2.2113
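For context, here is a minimal sketch of how an encoder-decoder like the one described above can be assembled with Hugging Face `transformers`. It is not the card's actual training script (that lives in the wandb run linked in the diff); the `answerdotai/ModernBERT-large` checkpoint name and the token-id wiring are assumptions.

```python
# Sketch only: pair a pretrained ModernBERT encoder with a randomly
# initialized GPT-2 decoder. Checkpoint names and token-id choices are
# assumptions, not taken from the model's actual training script.
from transformers import (
    AutoConfig,
    AutoModel,
    AutoModelForCausalLM,
    AutoTokenizer,
    EncoderDecoderModel,
)

# Pretrained encoder (assumed checkpoint name).
encoder = AutoModel.from_pretrained("answerdotai/ModernBERT-large")

# Random-init `gpt2` decoder: built from config alone (no pretrained
# weights), with cross-attention enabled so it can attend to encoder states.
decoder_config = AutoConfig.from_pretrained("gpt2")
decoder_config.is_decoder = True
decoder_config.add_cross_attention = True
decoder = AutoModelForCausalLM.from_config(decoder_config)

model = EncoderDecoderModel(encoder=encoder, decoder=decoder)

# Single shared tokenizer taken from the encoder side, per the model card.
tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-large")
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
model.config.eos_token_id = tokenizer.sep_token_id

# Context lengths from the card: 2048 tokens in, up to 512 tokens out.
inputs = tokenizer(
    "An example input document.",
    truncation=True,
    max_length=2048,
    return_tensors="pt",
)
output_ids = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Building the decoder with `from_config` rather than `from_pretrained` is what gives the random initialization described in the card; only the encoder starts from pretrained weights.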