Update README.md
README.md
CHANGED
@@ -3,4 +3,16 @@ datasets:
 - GregSamek/TinyNews
 language:
 - en
----
+---
+
+# Tiny News
+
+For a detailed overview of this project from start to finish, check out [GregSamek.github.io/tinynews](https://GregSamek.github.io/tinynews)
+
+TinyNews is a collection of one million synthetically generated news bulletins and several language models scratch-trained on this data. Evaluations suggest that TinyNews retains ~80% of the quality of the training data while using ~1/1000th the number of parameters of the models used to generate it.
+
+To run these models, git clone [the repository](https://github.com/gregsamek/TinyNews)
+
+Trained models and training data are available in this [🤗 Hugging Face Collection](https://huggingface.co/collections/GregSamek/tinynews-668aff540bf195d6e5e0e40f)
+
+This project is essentially a modified reimplementation of the Microsoft Research [TinyStories](https://arxiv.org/abs/2305.07759) project.