ELECTRA-Tiny is a language model with fewer than 6 million parameters that is effective for discriminative tasks. Its embeddings were enriched with multimodal embeddings drawn from a multiplex network, and it was trained on the BabyLM 100M dataset.
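For quick experimentation, the checkpoint can be loaded with the Hugging Face `transformers` library. The sketch below is a minimal usage example, not the authors' training or evaluation code; the repository id is a placeholder assumption, and `ElectraModel` is assumed to be the appropriate architecture class for this ELECTRA-style checkpoint.

```python
from transformers import AutoTokenizer, ElectraModel

# Placeholder repository id -- substitute the actual Hugging Face
# model id for this checkpoint.
model_id = "your-username/electra-tiny-babylm"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = ElectraModel.from_pretrained(model_id)

# Encode a sample sentence and run a forward pass.
inputs = tokenizer("Tiny models can still learn a lot.", return_tensors="pt")
outputs = model(**inputs)

# Contextual token embeddings: (batch_size, sequence_length, hidden_size)
print(outputs.last_hidden_state.shape)
```

For a discriminative downstream task such as sentence classification, the same checkpoint could instead be loaded with `ElectraForSequenceClassification` and fine-tuned in the usual way.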
Citation
@inproceedings{fields-etal-2023-tiny,
    title = "Tiny Language Models Enriched with Multimodal Knowledge from Multiplex Networks",
    author = "Fields, Clayton and Natouf, Osama and McMains, Andrew and Henry, Catherine and Kennington, Casey",
    editor = "Warstadt, Alex and Mueller, Aaron and Choshen, Leshem and Wilcox, Ethan and Zhuang, Chengxu and Ciro, Juan and Mosquera, Rafael and Paranjabe, Bhargavi and Williams, Adina and Linzen, Tal and Cotterell, Ryan",
    booktitle = "Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.conll-babylm.3/",
    doi = "10.18653/v1/2023.conll-babylm.3",
    pages = "47--57"
}
Acknowledgements
This material is based upon work supported by the National Science Foundation under Grant No. 2140642.