venkatasg/lil-bevo

venkatasg committed on Jul 21, 2023

Commit 61a0aba · 1 Parent(s): 2b58279

First commit with tldr of model

Files changed (1)

README.md +20 -0

README.md CHANGED

@@ -1,3 +1,23 @@
 ---
 license: mit
+language:
+- en
+tags:
+- babylm
 ---
+# Lil-Bevo
+Lil-Bevo is UT Austin's submission to the BabyLM challenge, specifically the *strict-small* track.
+[Link to GitHub Repo](https://github.com/venkatasg/Lil-Bevo)
+## TLDR:
+- Unigram tokenizer trained on 10M BabyLM tokens plus MAESTRO dataset for a vocab size of 16k.
+- `deberta-small-v3` trained on mixture of MAESTRO and 10M tokens for 3 epochs.
+- Model continues training for 50 epochs on 10M tokens with 128 sequence length.
+- Model continues training for 200 epochs on 10M tokens with 512 sequence length.
+- Model is trained with targeted linguistic masking for 10 epochs.
+  This README will be updated with more details soon.
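
The training stages listed in the TLDR can be written down as a small schedule for reference. This is only an illustrative sketch, not code from the Lil-Bevo repository: the `Stage` class, its field names, and the helper functions are hypothetical, and `seq_len` is left as `None` wherever the README does not state a sequence length.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    """One training stage from the TLDR (hypothetical structure, for illustration)."""
    description: str
    epochs: int
    seq_len: Optional[int] = None  # None where the README gives no sequence length


# Stages in the order the README lists them.
STAGES = [
    Stage("pretrain on MAESTRO + 10M BabyLM tokens", epochs=3),
    Stage("continue on 10M BabyLM tokens", epochs=50, seq_len=128),
    Stage("continue on 10M BabyLM tokens", epochs=200, seq_len=512),
    Stage("targeted linguistic masking", epochs=10),
]


def total_epochs(stages):
    """Sum the epoch counts across all stages."""
    return sum(stage.epochs for stage in stages)


def summarize(stages):
    """Return one human-readable line per stage."""
    lines = []
    for i, stage in enumerate(stages, start=1):
        if stage.seq_len is not None:
            length = f"seq_len={stage.seq_len}"
        else:
            length = "seq_len unspecified"
        lines.append(f"{i}. {stage.description}: {stage.epochs} epochs, {length}")
    return lines


if __name__ == "__main__":
    for line in summarize(STAGES):
        print(line)
    print("total epochs:", total_epochs(STAGES))
```

Keeping the schedule as data makes it easy to see that the bulk of training happens in the 512-token stage; the epoch counts and sequence lengths are the only values taken from the README.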