Skier8402's Collections
Datasets
Updated 3 days ago
Interesting datasets to help train LLMs and beyond
Open-Orca/OpenOrca • Viewer • Updated Feb 19 • 2.94M • 10.7k • 1.38k
NeelNanda/pile-10k • Viewer • Updated Oct 14, 2022 • 10k • 6.94k • 20
legacy-datasets/mc4 • Updated Mar 5, 2024 • 36.3k • 151
oscar-corpus/oscar • Updated Mar 21, 2024 • 43k • 184
deepset/prompt-injections • Viewer • Updated Jul 30, 2024 • 662 • 1.77k • 58
epfl-llm/guidelines • Viewer • Updated Mar 7, 2024 • 38k • 1.33k • 124
wanng/midjourney-v5-202304-clean • Viewer • Updated May 24, 2024 • 1.7M • 177 • 89
CohereForAI/aya_dataset • Viewer • Updated Jun 28, 2024 • 206k • 2.47k • 303
google/fleurs • Updated Aug 25, 2024 • 33.7k • 278
HuggingFaceTB/cosmopedia • Viewer • Updated Aug 12, 2024 • 31.1M • 37.5k • 598
microsoft/orca-math-word-problems-200k • Viewer • Updated Mar 4, 2024 • 200k • 1.92k • 448
HuggingFaceFW/fineweb • Viewer • Updated Jan 31 • 25B • 228k • 2.07k
proj-persona/PersonaHub • Viewer • Updated 24 days ago • 375k • 11.4k • 549
nyu-visionx/Cambrian-10M • Preview • Updated Jul 8, 2024 • 10k • 108
BAAI/Infinity-Instruct • Viewer • Updated Feb 25 • 20.4M • 5.14k • 607
NousResearch/hermes-function-calling-v1 • Viewer • Updated Aug 30, 2024 • 11.6k • 2.1k • 275
meta-llama/Llama-3.1-405B-Instruct • Text Generation • Updated Sep 25, 2024 • 41.2k • 568
OpenAssistant/oasst2 • Viewer • Updated Jan 11, 2024 • 135k • 1.97k • 250
OpenAssistant/oasst1 • Viewer • Updated May 2, 2023 • 88.8k • 9.46k • 1.37k
HuggingFaceTB/smoltalk • Viewer • Updated Feb 10 • 2.2M • 8.25k • 317
NovaSky-AI/Sky-T1_data_17k • Viewer • Updated Jan 14 • 16.4k • 1.4k • 179
cognitivecomputations/dolphin-r1 • Viewer • Updated Jan 30 • 814k • 3.11k • 275
HuggingFaceFW/fineweb-2 • Viewer • Updated Jan 8 • 12.5B • 62.1k • 450
HuggingFaceFW/fineweb-edu • Viewer • Updated Jan 31 • 3.3B • 366k • 655
open-thoughts/OpenThoughts-114k • Viewer • Updated Feb 20 • 228k • 33.5k • 672
open-r1/OpenR1-Math-220k • Viewer • Updated Feb 18 • 450k • 50.8k • 529
lelapa/Inkuba-Mono • Viewer • Updated Sep 5, 2024 • 68.8M • 34 • 13
lelapa/Inkuba-instruct • Viewer • Updated Sep 5, 2024 • 212M • 449 • 8
mozilla-foundation/common_voice_17_0 • Viewer • Updated Jun 16, 2024 • 13M • 36.1k • 241
intronhealth/afrimedqa_v2 • Viewer • Updated Feb 10 • 15.3k • 106 • 8
intronhealth/afrispeech-dialog • Preview • Updated Oct 28, 2024 • 135 • 2
intronhealth/afrispeech-200 • Updated Nov 20, 2023 • 845 • 25
arcinstitute/opengenome2 • Preview • Updated Feb 18 • 7.2k • 76
facebook/natural_reasoning • Viewer • Updated Feb 21 • 1.15M • 13.6k • 464
Jofthomas/hermes-function-calling-thinking-V1 • Viewer • Updated Feb 16 • 3.57k • 3.39k • 29
CohereForAI/Global-MMLU • Viewer • Updated 8 days ago • 602k • 19.7k • 116
FreedomIntelligence/medical-o1-reasoning-SFT • Viewer • Updated Feb 22 • 50.1k • 26.3k • 568
glaiveai/glaive-function-calling-v2 • Viewer • Updated Sep 27, 2023 • 113k • 1.71k • 420