JonasGeiping/crammed-bert-legacy


JonasGeiping committed on Jun 13, 2023

Commit 285bab3 · 1 Parent(s): ab7a979

Update README.md

Files changed (1)

README.md +1 -1

README.md CHANGED

@@ -17,7 +17,7 @@ tags:
 # crammed BERT (legacy/v1)
-This is one of the final models described in the **FIRST VERSION OF** "Cramming: Training a Language Model on a Single GPU in One Day". This is an *English*-language model pretrained like BERT, but with less compute. This one was trained for 24 hours on a single A6000 GPU. To use this model, you need the code from the repo at https://github.com/JonasGeiping/cramming.
 You can find the paper here (linked to the old version on arxiv): https://arxiv.org/abs/2212.14034/v1, and the abstract below:

 # crammed BERT (legacy/v1)
+This is one of the final models described in the **FIRST VERSION OF** "Cramming: Training a Language Model on a Single GPU in One Day". This is an *English*-language model pretrained like BERT, but with less compute. This one was trained for 24 hours on a single A6000 GPU. To use this model, you need the code from the repo at https://github.com/JonasGeiping/cramming tagged v1.13.
 You can find the paper here (linked to the old version on arxiv): https://arxiv.org/abs/2212.14034/v1, and the abstract below: