---
license: mit
datasets:
- omarmomen/babylm_10M
language:
- en
metrics:
- perplexity
library_name: transformers
---
# Model Card for omarmomen/structroberta_s1_final

This model is part of the experiments in the paper "Increasing The Performance of Cognitively Inspired Data-Efficient Language Models via Implicit Structure Building", published at the BabyLM workshop at CoNLL 2023 (https://aclanthology.org/2023.conll-babylm.29/).

<strong>omarmomen/structroberta_s1_final</strong> is a modified RoBERTa model that incorporates syntactic inductive bias through an unsupervised parsing mechanism.

This model variant places the parser network ahead of all attention blocks.

The model is pretrained on the BabyLM 10M dataset using a custom RobertaTokenizer (https://huggingface.co/omarmomen/babylm_tokenizer_32k).
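
Below is a minimal loading sketch with the `transformers` library. It assumes the checkpoint exposes a masked-LM head and that the custom StructRoBERTa classes ship with the repository (hence `trust_remote_code=True`); these are assumptions from the model card, not a verified recipe.

```python
# Minimal usage sketch (assumptions: masked-LM head available, custom model code on the Hub).
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("omarmomen/babylm_tokenizer_32k")
model = AutoModelForMaskedLM.from_pretrained(
    "omarmomen/structroberta_s1_final",
    trust_remote_code=True,  # assumption: the StructRoBERTa implementation is loaded from the repo
)

# Score a masked token with the RoBERTa-style <mask> placeholder.
inputs = tokenizer("The child <mask> a book.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch_size, sequence_length, vocab_size)
```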


Preprint: https://arxiv.org/abs/2310.20589