Reasoning Datasets Competition

Community Article · Published April 9, 2025

TL;DR: Bespoke Labs, Hugging Face, and Together.ai are launching a competition to find the most innovative reasoning datasets. Create a great proof-of-concept reasoning dataset and win prizes to help you scale your work!

The DeepSeek moment for datasets

Since the launch of DeepSeek-R1 in January 2025, we've seen remarkable growth in reasoning-focused datasets on the Hugging Face Hub, such as OpenThoughts-114k, OpenCodeReasoning, and codeforces-cot. These primarily cover math, coding, and science: domains with clearly verifiable answers.

Now, reasoning is expanding into new domains and tasks beyond these easily verifiable areas.

OpenThoughts-114k alone has helped train over 230 models! We believe future breakthroughs won't come from architecture alone, but from better data: datasets that reflect real-world complexity, uncertainty, and richness.

To accelerate progress, we're launching a Reasoning Dataset Competition.


How the competition works

The goal: create impactful proof-of-concept reasoning datasets and share them on the Hugging Face Hub. The best submissions will win prizes to help scale these datasets and train models using them.

🗓️ Timeline

  • Launch Date: April 9, 2025
  • Submission Deadline: May 1, 2025 (11:59 PM PT)
  • Winners Announced: May 5, 2025

🚀 Submission Instructions

  1. Create a dataset with at least 100 examples
  2. Upload to the Hugging Face Hub
  3. Tag it with reasoning-datasets-competition

We'll evaluate a sample of 100 examples from each submission (or all of them if you submit exactly 100).
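
If you haven't published a dataset before, here is a minimal sketch of steps 1 and 2 using the `datasets` library. The repo id and column names below are placeholders, not a required schema.

```python
# A minimal sketch of building and publishing a reasoning dataset.
# The repo id and column names are placeholders, not a required schema.
from datasets import Dataset

examples = [
    {
        "prompt": "A train leaves at 09:00 and covers 120 km at 60 km/h. When does it arrive?",
        "reasoning": "120 km at 60 km/h takes 120 / 60 = 2 hours, so the train arrives at 11:00.",
        "answer": "11:00",
    },
    # ... at least 100 examples in total
]

ds = Dataset.from_list(examples)
ds.push_to_hub("your-username/my-reasoning-dataset")  # requires `huggingface-cli login`
```

The competition tag itself lives in your dataset card's YAML metadata; see the requirements and the card sketch below.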

✅ Submission Requirements

  • Size: Minimum 100 examples
  • Documentation: Include a dataset card with:
    • Purpose and scope
    • Dataset creation method
    • Example uses
    • Limitations or biases
  • Viewer Preview: Must work on the HF viewer
  • Tag: reasoning-datasets-competition
  • License: Clear licensing info for research use

💡 While these are the minimum requirements, we encourage you to go beyond them! Think of your dataset card as your pitch: it's your chance to showcase what makes your dataset stand out and to help the judges see why you deserve a high score across our evaluation criteria: Approach, Domain, and Quality.
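
As a rough sketch of where the required tag and license go, here is one way to create and push a dataset card with `huggingface_hub`; the repo id is a placeholder and apache-2.0 is just an example license.

```python
# A rough sketch of a dataset card with the required tag and a license in its
# YAML metadata. The repo id is a placeholder; apache-2.0 is just an example.
from huggingface_hub import DatasetCard

card = DatasetCard("""---
license: apache-2.0
tags:
  - reasoning-datasets-competition
---

# My Reasoning Dataset

Describe the purpose and scope, creation method, example uses,
and known limitations or biases here.
""")
card.push_to_hub("your-username/my-reasoning-dataset")
```

The YAML front matter is what powers the Hub's tag filters, so double-check that reasoning-datasets-competition appears there.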

🔍 What We're Looking For

New Domains

  • Legal reasoning: Judgments based on laws and precedents
  • Financial analysis: Evaluation of investments
  • Literary interpretation: Symbolism and theme analysis
  • Ethics/philosophy: Moral reasoning and frameworks

Novel Tasks

  • Structured data extraction: Pulling structured data out of unstructured text through explicit reasoning
  • Zero-shot classification: Datasets focused on training smaller models to be more effective zero-shot classifiers through reasoning
  • Search improvement: Reasoning datasets designed to enhance search relevance and accuracy
  • Diagrammatic reasoning: Datasets that train models to interpret, analyze, and reason about visual representations like flowcharts, system diagrams, or decision trees
  • Constraint satisfaction problems: Collections teaching models to reason through complex scheduling, resource allocation, or optimization scenarios with multiple interdependent constraints
  • Evidence evaluation: Datasets demonstrating how to assess source credibility and weigh conflicting information
  • Counterfactual reasoning: Collections developing "what if" thinking by systematically altering variables and exploring potential outcomes

Reasoning Distillation

Inspired by the DeepSeek-R1 paper: distill reasoning from large models into smaller ones.
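
A common recipe (a rough sketch, not the paper's exact pipeline) is to sample reasoning traces from a large model through an API and store prompt/trace pairs for fine-tuning a smaller model. The model name and output schema below are assumptions, and Together.ai's SDK is used only because API credits are part of the prizes.

```python
# A rough sketch of collecting reasoning traces from a large model via the
# Together API. The model name and output schema are assumptions, not a
# prescribed recipe; any strong reasoning model you can query will do.
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

def distill(question: str) -> dict:
    """Ask a large reasoning model for a full trace and keep it as a dataset row."""
    response = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-R1",  # assumed model id on Together.ai
        messages=[{"role": "user", "content": question}],
    )
    return {"prompt": question, "reasoning_trace": response.choices[0].message.content}

rows = [distill(q) for q in ["Why does ice float on water?"]]
```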

Supporting a reasoning ecosystem

Beyond direct reasoning datasets, we're interested in collections that help build a robust reasoning ecosystem. This could include:

  • Reasoning classification: Datasets for training models to classify or annotate different types of reasoning
  • Error detection: Datasets for training models to identify flaws in reasoning processes

This is an area where you can potentially make a big impact without needing many resources to get started.
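
For instance, a hypothetical error-detection row might look like this (the field names are illustrative, not a required schema):

```python
# A hypothetical example row for a reasoning error-detection dataset.
# Field names are illustrative only, not a required schema.
example = {
    "question": "Is 91 a prime number?",
    "model_reasoning": "91 is odd and not divisible by 3 or 5, so it must be prime.",
    "contains_error": True,
    "error_type": "incomplete_case_check",
    "explanation": "The reasoning never tests 7: 91 = 7 x 13, so 91 is not prime.",
}
```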

🧪 Evaluation Criteria

Submissions are scored along three dimensions:

  • Approach: the dataset creation method (tools, prompts, pipelines). We value novelty and scalability.
  • Domain: the domain or skill covered. We value real-world relevance and coverage of underexplored fields.
  • Quality: the clarity, diversity, and structure of examples. We value reasoning-rich prompts and minimal hallucination.

πŸ† Prizes

🥇 First Place

  • $1,500 API credits from Together.ai
  • $1,500 Amazon (or country-specific equivalent) gift card
  • Hugging Face Pro subscription + compute credits

🥈 1st & 2nd Runners-Up

  • $500 Amazon (or country-specific equivalent) gift card
  • HF Pro subscription + compute credits

🌟 Spotlight Awards

The top 4 most innovative uses of Curator each get a $250 Amazon (or country-specific equivalent) gift card

🎁 All Participants

$50 in Together.ai API credits (see the FAQ below for how to claim them)

📝 Signup Instructions

Step 1: Register here to receive Together.ai credits and updates on the competition

Step 2: Join the discussion thread

Step 3: Join #reasoning-dataset-competition on Discord

🧰 Helpful Resources

❓ FAQ

Q: Can I submit multiple datasets?
A: Yes!

Q: Can I collaborate with others?
A: Absolutely. Teams are welcome.

Q: How do I claim the Together AI credits?
A: Fill out this questionnaire on Together's website and enter the hackathon name (question 6) as 'Reasoning datasets competition'. Here's a walkthrough.

Q: Do I have to use Curator?
A: No. Use any tools or methods you like.

Q: Do I have to use LLMs or synthetic data?
A: Not at all. All methodologies are welcome.

Got more questions? Drop by the HF discussion thread or chat on Discord!

