Reasoning Datasets Competition

Community Article · Published April 9, 2025

TL;DR: Bespoke Labs, Hugging Face, and Together.ai are launching a competition to find the most innovative reasoning datasets. Create a great proof-of-concept reasoning dataset and win prizes to help you scale your work!

The DeepSeek moment for datasets

Since the launch of DeepSeek-R1 in January 2025, we've seen remarkable growth in reasoning-focused datasets on the Hugging Face Hub, such as OpenThoughts-114k, OpenCodeReasoning, and codeforces-cot. These primarily cover math, coding, and science: domains with clearly verifiable answers.

Now, reasoning is expanding into new domains and tasks beyond these easily verifiable areas.

OpenThoughts-114k alone has helped train over 230 models! We believe future breakthroughs won't come from architecture alone, but from better data: datasets that reflect real-world complexity, uncertainty, and richness.

To accelerate progress, we're launching a Reasoning Dataset Competition.


How the competition works

The goal: create impactful proof-of-concept reasoning datasets and share them on the Hugging Face Hub. The best submissions will win prizes to help scale these datasets and train models using them.

🗓️ Timeline

  • Launch Date: April 9, 2025
  • Submission Deadline: May 1, 2025 (11:59 PM PT)
  • Winners Announced: May 5, 2025

🚀 Submission Instructions

  1. Create a dataset with at least 100 examples
  2. Upload to the Hugging Face Hub
  3. Tag it with reasoning-datasets-competition

We'll evaluate a sample of 100 examples from each submission (or all of them if you submit exactly 100).
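
If you haven't published a dataset before, here is a minimal sketch of steps 1 and 2 using the `datasets` library. The repo id and column names below are placeholders, not a required schema.

```python
# A minimal sketch of building and publishing a reasoning dataset.
# The repo id and column names are placeholders, not a required schema.
from datasets import Dataset

examples = [
    {
        "prompt": "A train leaves at 09:00 and covers 120 km at 60 km/h. When does it arrive?",
        "reasoning": "120 km at 60 km/h takes 120 / 60 = 2 hours, so the train arrives at 11:00.",
        "answer": "11:00",
    },
    # ... at least 100 examples in total
]

ds = Dataset.from_list(examples)
ds.push_to_hub("your-username/my-reasoning-dataset")  # requires `huggingface-cli login`
```

The competition tag itself lives in your dataset card's YAML metadata; see the requirements and the card sketch below.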

✅ Submission Requirements

  • Size: Minimum 100 examples
  • Documentation: Include a dataset card with:
    • Purpose and scope
    • Dataset creation method
    • Example uses
    • Limitations or biases
  • Viewer Preview: Must work on the HF viewer
  • Tag: reasoning-datasets-competition
  • License: Clear licensing info for research use

💡 While these are the minimum requirements, we encourage you to go beyond them! Think of your dataset card as your pitch: it's your chance to showcase what makes your dataset stand out and to help the judges see why you deserve a high score across our evaluation criteria: Approach, Domain, and Quality.
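
As a rough sketch of where the required tag and license go, here is one way to create and push a dataset card with `huggingface_hub`; the repo id is a placeholder and apache-2.0 is just an example license.

```python
# A rough sketch of a dataset card with the required tag and a license in its
# YAML metadata. The repo id is a placeholder; apache-2.0 is just an example.
from huggingface_hub import DatasetCard

card = DatasetCard("""---
license: apache-2.0
tags:
  - reasoning-datasets-competition
---

# My Reasoning Dataset

Describe the purpose and scope, creation method, example uses,
and known limitations or biases here.
""")
card.push_to_hub("your-username/my-reasoning-dataset")
```

The YAML front matter is what powers the Hub's tag filters, so double-check that reasoning-datasets-competition appears there.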

🔍 What We're Looking For

New Domains

  • Legal reasoning: Judgments based on laws and precedents
  • Financial analysis: Evaluation of investments
  • Literary interpretation: Symbolism and theme analysis
  • Ethics/philosophy: Moral reasoning and frameworks

Novel Tasks

  • Structured data extraction: Pulling structured data out of unstructured text through explicit reasoning
  • Zero-shot classification: Datasets focused on training smaller models to be more effective zero-shot classifiers through reasoning
  • Search improvement: Reasoning datasets designed to enhance search relevance and accuracy
  • Diagrammatic reasoning: Datasets that train models to interpret, analyze, and reason about visual representations like flowcharts, system diagrams, or decision trees
  • Constraint satisfaction problems: Collections teaching models to reason through complex scheduling, resource allocation, or optimization scenarios with multiple interdependent constraints
  • Evidence evaluation: Datasets demonstrating how to assess source credibility and weigh conflicting information
  • Counterfactual reasoning: Collections developing "what if" thinking by systematically altering variables and exploring potential outcomes

Reasoning Distillation

Inspired by the DeepSeek-R1 paper: distill reasoning from large models into smaller ones.
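
A common recipe (a rough sketch, not the paper's exact pipeline) is to sample reasoning traces from a large model through an API and store prompt/trace pairs for fine-tuning a smaller model. The model name and output schema below are assumptions, and Together.ai's SDK is used only because API credits are part of the prizes.

```python
# A rough sketch of collecting reasoning traces from a large model via the
# Together API. The model name and output schema are assumptions, not a
# prescribed recipe; any strong reasoning model you can query will do.
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

def distill(question: str) -> dict:
    """Ask a large reasoning model for a full trace and keep it as a dataset row."""
    response = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-R1",  # assumed model id on Together.ai
        messages=[{"role": "user", "content": question}],
    )
    return {"prompt": question, "reasoning_trace": response.choices[0].message.content}

rows = [distill(q) for q in ["Why does ice float on water?"]]
```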

Supporting a reasoning ecosystem

Beyond direct reasoning datasets, we're interested in collections that help build a robust reasoning ecosystem. This could include:

  • Reasoning classification: Datasets for training models to classify or annotate different types of reasoning
  • Error detection: Datasets for training models to identify flaws in reasoning processes

This is an area where you can potentially make a big impact without needing many resources to get started.
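
For instance, a hypothetical error-detection row might look like this (the field names are illustrative, not a required schema):

```python
# A hypothetical example row for a reasoning error-detection dataset.
# Field names are illustrative only, not a required schema.
example = {
    "question": "Is 91 a prime number?",
    "model_reasoning": "91 is odd and not divisible by 3 or 5, so it must be prime.",
    "contains_error": True,
    "error_type": "incomplete_case_check",
    "explanation": "The reasoning never tests 7: 91 = 7 x 13, so 91 is not prime.",
}
```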

🧪 Evaluation Criteria

Submissions are scored along three dimensions:

  • Approach: the dataset creation method (tools, prompts, pipelines). We value novelty and scalability.
  • Domain: the domain or skill covered. We value real-world relevance and coverage of underexplored fields.
  • Quality: the clarity, diversity, and structure of examples. We value reasoning-rich prompts and minimal hallucination.

πŸ† Prizes

🥇 First Place

  • $1,500 API credits from Together.ai
  • $1,500 Amazon (or country-specific equivalent) gift card
  • Hugging Face Pro subscription + compute credits

🥈 1st & 2nd Runners-Up

  • $500 Amazon (or country-specific equivalent) gift card
  • HF Pro subscription + compute credits

🌟 Spotlight Awards

The top 4 most innovative uses of Curator each get a $250 Amazon (or country-specific equivalent) gift card

🎁 All Participants

$50 in Together.ai API credits (see the FAQ below for how to claim them)

📝 Signup Instructions

Step 1: Register here to receive Together.ai credits and updates on the competition

Step 2: Join the discussion thread

Step 3: Join #reasoning-dataset-competition on Discord

🧰 Helpful Resources

❓ FAQ

Q: Can I submit multiple datasets?
A: Yes!

Q: Can I collaborate with others?
A: Absolutely. Teams are welcome.

Q: How do I claim the Together AI credits?
A: Fill out this questionnaire on Together's website and enter the hackathon name (question 6) as 'Reasoning datasets competition'. Here's a walkthrough.

Q: Do I have to use Curator?
A: No. Use any tools or methods you like.

Q: Do I have to use LLMs or synthetic data?
A: Not at all. All methodologies are welcome.

Got more questions? Drop by the HF discussion thread or chat on Discord!

