---
datasets:
- PatrickHaller/dsir-pile-100M-words
language:
- en
library_name: transformers
---
Our model for the 100M words track of the 2024 BabyLM challenge.
To download and use this model, the [fla](https://github.com/sustcsonglin/flash-linear-attention) package must be installed:
```bash
pip install -U git+https://github.com/sustcsonglin/flash-linear-attention
```
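Once fla is installed, the model can be loaded with the standard transformers auto classes. A minimal sketch follows; it assumes the repository ID is passed in by the caller and that `trust_remote_code=True` is needed so the custom fla-based architecture can be resolved:

```python
def load_babylm_model(repo_id: str):
    """Load tokenizer and model from the Hugging Face Hub.

    `repo_id` is this model repository's Hub ID (a placeholder here);
    `trust_remote_code=True` is assumed to be required for the custom
    fla-backed model code.
    """
    # Local import so the sketch can be defined without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
    return tokenizer, model
```

Generation then works as with any causal language model, e.g. `model.generate(**tokenizer("Hello", return_tensors="pt"))`.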