datasets:
- allenai/c4
language:
- en
library_name: transformers
license: apache-2.0

## Model Description
This model uses the Llama 2 architecture with only 30 million parameters. It was trained on approximately 2 billion tokens of diverse web data, drawn from the first 1,000,000 rows of the English split of the uncleaned C4 dataset.
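For a sense of scale, the parameter budget of a Llama-style decoder in this size class can be sketched with back-of-the-envelope arithmetic. The hyperparameters below (vocabulary size, hidden width, layer count, and so on) are illustrative assumptions chosen to land near 30M; the card does not state the model's actual configuration.

```python
def llama_param_count(vocab=32000, d=384, d_ff=1024, n_layers=8, tied_embeddings=True):
    """Approximate parameter count for a hypothetical Llama-style decoder.

    All hyperparameters are assumptions for illustration, not the
    model's real configuration.
    """
    embed = vocab * d                  # token embedding table
    attn = 4 * d * d                   # q, k, v, o projections
    mlp = 3 * d * d_ff                 # gate, up, and down projections (SwiGLU)
    norms = 2 * d                      # two RMSNorm weight vectors per layer
    per_layer = attn + mlp + norms
    head = 0 if tied_embeddings else vocab * d  # lm_head, often tied to embeddings
    return embed + n_layers * per_layer + d + head  # +d for the final RMSNorm

print(f"~{llama_param_count() / 1e6:.1f}M parameters")
```

Most of the budget at this scale sits in the embedding table, which is why tying the input embeddings to the output head (as sketched above) matters much more for small models than for multi-billion-parameter ones.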