pcuenq HF staff committed on
Commit 3dc7f7d · verified · 1 Parent(s): d7fb243

Cross-reference datasets

This improves discoverability and increases transparency.

To get started, I added the two largest ones (in terms of tokens contributed to training), as mentioned in the model card. Feel free to be more exhaustive :)

Files changed (1)
  1. README.md +3 -0
README.md CHANGED
@@ -39,6 +39,9 @@ language:
  - sr
  - sv
  - uk
+ datasets:
+ - oscar-corpus/colossal-oscar-1.0
+ - HuggingFaceFW/fineweb-edu
  ---
 
  ![](./images/logo_alia_2.png)
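
For reviewers who want to check the result, here is a minimal sketch of reading the new `datasets` key back out of the README front matter. It uses only the Python standard library and handles just the `key:` plus `- item` shape seen in this diff, so it is not a general YAML parser; real tooling would more likely use a proper YAML library. The function name `read_front_matter_lists` and the `SAMPLE_README` excerpt are hypothetical and exist only for illustration.

```python
def read_front_matter_lists(readme_text):
    """Return {key: [items]} for top-level list keys in README front matter.

    Simplified sketch: only understands `key:` lines followed by `- item`
    lines. Scalars and nested structures are skipped, not parsed.
    """
    lines = readme_text.splitlines()
    # Front matter must open with a `---` delimiter on the first line.
    if not lines or lines[0].strip() != "---":
        return {}

    result = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            # Closing delimiter: the metadata block ends here.
            break
        if stripped.startswith("- ") and current_key is not None:
            result[current_key].append(stripped[2:].strip())
        elif stripped.endswith(":") and not line.startswith((" ", "-")):
            # A top-level key with no inline value starts a new list.
            current_key = stripped[:-1]
            result[current_key] = []
        else:
            # Scalar value or unsupported construct: ignore it.
            current_key = None
    return result


# Hypothetical README excerpt mirroring the metadata after this commit.
SAMPLE_README = """---
language:
- sr
- sv
- uk
datasets:
- oscar-corpus/colossal-oscar-1.0
- HuggingFaceFW/fineweb-edu
---

![](./images/logo_alia_2.png)
"""

metadata = read_front_matter_lists(SAMPLE_README)
print(metadata["datasets"])
```

Adding further datasets later only requires appending more `- owner/name` entries under `datasets:`; the sketch would pick them up without changes.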