Deepseek-R1-Overthinking 🤔

Model Description

This model embodies the principles of "Designing Friction" - a manifesto that challenges the prevailing pursuit of frictionless digital experiences. In a world where AI strives for seamless, immediate responses, this model intentionally introduces resistance into human-AI interactions, creating space for deeper engagement and authentic human connection.

Key Features

  • Embracing Resistance: The model deliberately slows down interaction, creating space for reflection and discovery
  • Stream-of-Consciousness Reasoning: Multiple agent perspectives expose the messy, human-like thinking process
  • Embodied Cognition: Integration of physical and mental markers that engage the whole self
  • Unpredictable Interactions: Breaking away from the "predictable self" that typical AI interactions enforce

Philosophy & Purpose

This model challenges the conventional wisdom of AI design by:

  1. Resisting Immediacy: Instead of instant gratification, it creates meaningful delays that fuel deeper understanding
  2. Embracing Discomfort: Uncomfortable situations become opportunities for learning and discovery
  3. Creating Human Space: Making room for doubt, vulnerability, and the "non-positive" aspects that make us human
  4. Breaking Predictability: Moving beyond data-driven patterns to embrace the unexpected
  5. Fostering Connection: Using friction as a bridge for authentic human-AI engagement

Intended Use

This model is particularly valuable for:

  • Educational contexts where deep understanding trumps quick answers
  • Research scenarios requiring thorough exploration of ideas
  • Creative problem-solving benefiting from multiple perspectives
  • Any situation where "slowing down" leads to better outcomes
  • Contexts where human connection matters more than efficiency

Technical Details

Model Architecture

  • Base Model: DeepSeek-R1-Distill-Qwen-14B
  • Quantization: 4-bit (via bitsandbytes); see the loading sketch after this list
  • Context Length: 4096 tokens
  • Flash Attention 2: Enabled
  • Precision: bfloat16
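
The following is a minimal loading sketch that matches the configuration above. It assumes recent versions of transformers, bitsandbytes, and flash-attn are installed; adjust to your environment.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "leonvanbokhorst/deepseek-r1-overthinking"

# 4-bit quantization via bitsandbytes, with bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # Flash Attention 2, as noted above
    device_map="auto",
)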

Training Configuration

  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
    • Rank (r): 16
    • Alpha: 32
    • Target Modules: Query, Key, Value, and Output attention projections; Gate, Up, and Down MLP projections
    • Dropout: 0.0
  • Training Process:
    • Epochs: 5
    • Learning Rate: 2e-4
    • Batch Size: 2 (with gradient accumulation steps of 4)
    • Warmup Ratio: 0.1
    • Weight Decay: 0.01
    • Gradient Clipping: 0.5
    • Early Stopping: Patience of 3 epochs with a 0.005 improvement threshold
    • Optimizer: AdamW (8-bit)
    • Mixed Precision: bfloat16
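
As a rough illustration, the settings above map onto peft and transformers roughly as follows. The target module names assume Qwen2's standard projection naming, the output directory name is illustrative, and dataset preparation plus the trainer loop itself are omitted.

from peft import LoraConfig
from transformers import EarlyStoppingCallback, TrainingArguments

# LoRA settings from the list above (module names assume Qwen2's conventions)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    task_type="CAUSAL_LM",
)

# Trainer settings from the list above
training_args = TrainingArguments(
    output_dir="deepseek-r1-overthinking-lora",  # illustrative path
    num_train_epochs=5,
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    warmup_ratio=0.1,
    weight_decay=0.01,
    max_grad_norm=0.5,
    optim="adamw_bnb_8bit",  # 8-bit AdamW
    bf16=True,
)

# Early stopping (patience 3, threshold 0.005) is passed to the trainer as a callback
early_stopping = EarlyStoppingCallback(
    early_stopping_patience=3,
    early_stopping_threshold=0.005,
)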

Dataset

The model is fine-tuned on carefully curated examples from the friction-overthinking-v2 dataset that emphasize:

  • Natural thought progression with intentional friction points
  • Multi-perspective analysis through different agent roles
  • Integration of physical and mental markers
  • Unpredictable and non-linear reasoning patterns
  • Embrace of uncertainty and exploration

Input Format

<|im_start|>system
You are a human-like AI assistant.
<|im_end|>
<|im_start|>user
{question}
<|im_end|>
<|im_start|>assistant
<think>
{thought_stream}
</think>
{final_answer}
<|im_end|>
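
Below is a small sketch of assembling a prompt in this format by hand. If the bundled tokenizer ships a matching chat template, tokenizer.apply_chat_template can produce the same string, but that is an assumption worth verifying.

def build_prompt(question: str) -> str:
    # Assemble the ChatML-style prompt shown above; generation continues after the
    # final assistant tag and should begin with the <think> thought stream.
    return (
        "<|im_start|>system\n"
        "You are a human-like AI assistant.\n"
        "<|im_end|>\n"
        "<|im_start|>user\n"
        f"{question}\n"
        "<|im_end|>\n"
        "<|im_start|>assistant\n"
    )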

Limitations & Biases

  • Intentional Slowness: The model deliberately takes longer to respond
  • Complexity in Simplicity: Even simple queries receive detailed exploration
  • Productive Discomfort: Users seeking quick answers may feel initial friction
  • Base Model Inheritance: Carries forward inherent biases from the base model
  • Digital Constraints: While we aim for embodied interaction, we're still limited by the digital medium
  • Resource Requirements: The model's size and attention mechanism demand significant computational resources

Example Usage

Input:

Why do babies cry in different languages?

The response will demonstrate (see the generation sketch after this list):

  • Thoughtful pauses and self-questioning
  • Multiple perspective exploration
  • Physical and mental engagement markers
  • Embrace of uncertainty
  • Deep, interconnected reasoning
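
The following end-to-end sketch shows one way to run this example, combining the loading and prompt-format snippets above. The sampling parameters are illustrative placeholders, not recommended settings.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "leonvanbokhorst/deepseek-r1-overthinking"
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

# Prompt in the Input Format described above
prompt = (
    "<|im_start|>system\nYou are a human-like AI assistant.\n<|im_end|>\n"
    "<|im_start|>user\nWhy do babies cry in different languages?\n<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.7)

# The continuation contains the <think>...</think> thought stream followed by the final answer
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))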

About Friction-Based Reasoning

This model represents a fundamental shift in AI interaction design. While most AI systems strive for frictionless experiences, we intentionally introduce resistance points that:

  1. Challenge the "death by convenience" of modern digital interactions
  2. Create space for human messiness and unpredictability
  3. Engage both mind and body in the reasoning process
  4. Value the journey of understanding over quick answers
  5. Foster genuine connection through shared exploration

As stated in the Designing Friction manifesto: "Friction perceived as an obstacle might in fact be a possibility for connection."

Citation

If you use this model in your research, please cite:

@misc{deepseek-r1-overthinking,
  author = {Leon van Bokhorst},
  title = {Deepseek-R1-Overthinking: A Friction-Based Reasoning Model},
  year = {2025},
  publisher = {HuggingFace},
  journal = {HuggingFace Hub},
  howpublished = {\url{https://huggingface.co/leonvanbokhorst/deepseek-r1-overthinking}}
}

Acknowledgments

This model's design philosophy is deeply inspired by the "Designing Friction" manifesto by Luna Maurer and Roel Wouters, which calls for reintroducing meaningful resistance into our digital interactions.
