This is our model for the 100M-word track of the 2024 BabyLM Challenge.

To download and use this model, the fla package must be installed:

pip install -U git+https://github.com/sustcsonglin/flash-linear-attention
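
Once the package is installed, loading the model might look like the following. This is a minimal sketch, not a tested recipe. It assumes that importing `fla` registers the HGRN2 architecture with the `transformers` Auto classes, that the checkpoint ships a tokenizer, and that a CUDA GPU is available for the fla kernels. The helper names `load_model` and `generate` and the prompt are hypothetical; the model ID is the one shown on this page.

```python
# Sketch under stated assumptions; helper names are illustrative only.
MODEL_ID = "PatrickHaller/hgrn2_pile_100m_distill_babylm"


def load_model(model_id: str = MODEL_ID):
    """Return (tokenizer, model, device). Imports are deferred so this
    file can be imported without the heavy dependencies installed."""
    import torch
    import fla  # noqa: F401  (assumption: registers HGRN2 with transformers)
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the fla kernels generally expect a CUDA device.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id).to(device)
    model.eval()
    return tokenizer, model, device


def generate(prompt: str, max_new_tokens: int = 20) -> str:
    """Continue `prompt` with the model's default generation settings."""
    tokenizer, model, device = load_model()
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("The child saw"))
```

If loading fails with an unknown-architecture error, the registration assumption above likely does not hold for the installed versions of `fla` and `transformers`.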
