Model Card for omarmomen/structformer_s2_final_with_pos

This model is part of the experiments in the paper "Increasing The Performance of Cognitively Inspired Data-Efficient Language Models via Implicit Structure Building" (https://aclanthology.org/2023.conll-babylm.29/), published at the BabyLM workshop at CoNLL 2023.

omarmomen/structformer_s2_final_with_pos is a modification of the vanilla transformer encoder to incorporate syntactic inductive bias using an unsupervised parsing mechanism.

This model variant places the parser network after 4 attention blocks.
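The placement described above can be pictured as an ordered list of stages. The sketch below is only a schematic, not the model's actual code: the stage names, the helper `build_layer_plan`, and the total of 8 blocks are hypothetical, and it assumes that blocks after the parser are the ones that consume its output.

```python
from typing import List


def build_layer_plan(num_blocks: int, parser_after: int = 4) -> List[str]:
    """Return an ordered list of stage names for an encoder in which a
    parser network is inserted after `parser_after` attention blocks.

    All stage names are illustrative labels, not identifiers from the model.
    """
    if not 0 <= parser_after <= num_blocks:
        raise ValueError("parser_after must lie between 0 and num_blocks")
    # Blocks before the parser run as ordinary attention blocks.
    plan = ["attention_block"] * parser_after
    # The parser sits between the two groups of blocks.
    plan.append("parser_network")
    # Assumption: the remaining blocks use the parser's induced structure.
    plan += ["structure_aware_attention_block"] * (num_blocks - parser_after)
    return plan


# num_blocks=8 is a hypothetical total; the card only states the parser position.
plan = build_layer_plan(num_blocks=8, parser_after=4)
print(plan)
```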

The model is pretrained on the BabyLM 10M dataset using a custom pretrained RobertaTokenizer (https://huggingface.co/omarmomen/babylm_tokenizer_32k).
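A minimal loading sketch with the `transformers` library follows. It rests on two assumptions not confirmed by this card: that the checkpoint ships custom modelling code (hence `trust_remote_code=True`), and that it is registered for `AutoModelForMaskedLM`. The helper name `load_model_and_tokenizer` is hypothetical.

```python
MODEL_ID = "omarmomen/structformer_s2_final_with_pos"
TOKENIZER_ID = "omarmomen/babylm_tokenizer_32k"


def load_model_and_tokenizer(model_id: str = MODEL_ID,
                             tokenizer_id: str = TOKENIZER_ID):
    """Download and return (model, tokenizer) from the Hugging Face Hub."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed.
    from transformers import AutoModelForMaskedLM, RobertaTokenizer

    # The card states the tokenizer is a custom pretrained RobertaTokenizer.
    tokenizer = RobertaTokenizer.from_pretrained(tokenizer_id)
    # Assumption: the repository provides its own model class, so remote
    # code must be trusted; review that code before enabling this flag.
    model = AutoModelForMaskedLM.from_pretrained(model_id,
                                                 trust_remote_code=True)
    model.eval()  # switch off dropout for inference
    return model, tokenizer
```

Calling `load_model_and_tokenizer()` needs network access and executes code from the model repository, so it is worth inspecting that code first.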

https://arxiv.org/abs/2310.20589

The model requires custom code execution, so it cannot be deployed to the HF Inference API, which does not support such models.
