---
datasets:
  - PatrickHaller/dsir-pile-100M-words
language:
  - en
library_name: transformers
---

This is our model for the 100M words track of the 2024 BabyLM challenge.

To download and use this model, the `fla` package (flash-linear-attention) must be installed:

pip install -U git+https://github.com/sustcsonglin/flash-linear-attention
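
Once the package is installed, loading might look like the following minimal sketch. It rests on several assumptions that this card does not confirm: that importing `fla` registers its architectures with the `transformers` Auto classes, that the checkpoint is a causal language model, and that `model_id` (a placeholder, not a real repository name) is replaced with this model's Hub id. The helper names `fla_available` and `load_model` are hypothetical.

```python
import importlib.util


def fla_available() -> bool:
    """Return True if the `fla` package can be found in this environment."""
    return importlib.util.find_spec("fla") is not None


def load_model(model_id: str):
    """Load a tokenizer and model from the Hugging Face Hub.

    `model_id` is a placeholder for this model's repository id.
    """
    if not fla_available():
        raise ImportError(
            "The `fla` package is required; install it with the pip command above."
        )

    # Assumption: importing `fla` registers its model classes with the
    # transformers Auto classes, so the import is needed for its side effect.
    import fla  # noqa: F401
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the checkpoint is a causal language model.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model
```

Calling `tokenizer, model = load_model(model_id)` with the real repository id downloads the weights on first use. If loading fails with an unknown architecture error, the checkpoint may need a different loading path than the one assumed here.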