thomwolf HF staff committed on
Commit 90a0d08 · verified · 1 Parent(s): 1c03947

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -17,3 +17,5 @@ The creation of 🍷 FineWeb involved careful processing and filtering of large
 All code and artefacts needed for reproduction are public and built on top of open source libraries, such as the 🤗 libraries [`datatrove`](https://github.com/huggingface/datatrove/), [`nanotron`](https://github.com/huggingface/nanotron/) or [`lighteval`](https://github.com/huggingface/lighteval/).
 
 Version 1 of the 🍷 FineWeb dataset is available [here](https://huggingface.co/datasets/HuggingFaceFW/fineweb). Our ablation models can be found [here](https://huggingface.co/collections/HuggingFaceFW/ablation-models-662457b0d213e8c14fe47f32).
 
 
 
+
+ Version 2 of the 🥂 FineWeb dataset (multilingual extension to +1800 languages/script) is available [here](https://huggingface.co/datasets/HuggingFaceFW/fineweb-2).