# SoT_DistilBERT: Paradigm Selection Model for Sketch-of-Thought

## What is Sketch-of-Thought?
Sketch-of-Thought (SoT) is a novel prompting framework for efficient reasoning in language models that combines cognitive-inspired reasoning paradigms with linguistic constraints to minimize output token usage while preserving reasoning accuracy.
Unlike conventional Chain of Thought (CoT) approaches that produce verbose reasoning chains, SoT implements three distinct reasoning paradigms:
- **Conceptual Chaining**: connects essential ideas in logical sequences through structured step links. Effective for commonsense reasoning, multi-hop inference, and fact-based recall tasks.
- **Chunked Symbolism**: organizes numerical and symbolic reasoning into structured steps with equations, variables, and arithmetic operations. Excels at mathematical problems and technical calculations.
- **Expert Lexicons**: leverages domain-specific shorthand, technical symbols, and jargon for precise and efficient communication. Suited to technical disciplines that demand maximum information density.
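As a concrete illustration, the Chunked Symbolism sketch for a simple arithmetic word problem condenses the reasoning into a few symbolic steps instead of full sentences (this is the same output shown in the Qwen2.5-7B example later in this card):

```
<think>
A = 5
A -= 3
A = 2
</think>
\boxed{2}
```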
## Loading the Model
This repository contains the DistilBERT paradigm selection model for the Sketch-of-Thought (SoT) framework. You can load and use it directly with Hugging Face Transformers:
```python
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
import torch

# Load the model directly from Hugging Face
model = DistilBertForSequenceClassification.from_pretrained("saytes/SoT_DistilBERT")
tokenizer = DistilBertTokenizer.from_pretrained("saytes/SoT_DistilBERT")

# Map paradigm names to class indices (and back)
label_mapping = {
    "chunked_symbolism": 0,
    "conceptual_chaining": 1,
    "expert_lexicons": 2
}
label_mapping_reverse = {v: k for k, v in label_mapping.items()}

# Classify a question into one of the three SoT paradigms
def classify_question(question):
    inputs = tokenizer(question, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    predicted_class = torch.argmax(outputs.logits, dim=1).item()
    return label_mapping_reverse[predicted_class]

# Example usage
question = "Alice has 5 apples. She gives 3 apples to Bob. How many apples does Alice have?"
paradigm = classify_question(question)
print(f"Recommended paradigm: {paradigm}")  # Output: "chunked_symbolism"
```
For easier integration, we also provide a complete Python package. See the GitHub repository or the "Complete Package" section below for details.
## Model Description
The SoT_DistilBERT model is a fine-tuned DistilBERT classifier trained to select the optimal reasoning paradigm for a given query based on the Sketch-of-Thought framework.
### Training Data
The model was trained on approximately 14,200 samples across various reasoning tasks, with each sample labeled using one of the three SoT paradigms. Labels were assigned using GPT-4o with a classification-specific prompt based on predefined heuristics.
### Model Architecture

- Base model: DistilBERT (`distilbert/distilbert-base-uncased`)
- Training: 5 epochs, batch size 64, learning rate 2e-5
- Loss: Cross-entropy
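The training data and script are not distributed with this repository; the following is a minimal sketch of how an equivalent fine-tune could be set up with the stated hyperparameters. The two-row dataset is a toy stand-in for the real ~14,200-sample GPT-4o-labeled data:

```python
from datasets import Dataset
from transformers import (DistilBertTokenizerFast, DistilBertForSequenceClassification,
                          Trainer, TrainingArguments)

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=3  # three SoT paradigms
)

# Toy stand-in for the real labeled dataset (not released with this model)
data = Dataset.from_dict({
    "text": ["Alice has 5 apples. She gives 3 to Bob. How many are left?",
             "Why do leaves change color in autumn?"],
    "label": [0, 1],  # chunked_symbolism, conceptual_chaining
})
data = data.map(lambda x: tokenizer(x["text"], truncation=True, padding="max_length"))

args = TrainingArguments(
    output_dir="sot_distilbert",
    num_train_epochs=5,             # hyperparameters as stated above
    per_device_train_batch_size=64,
    learning_rate=2e-5,
)

# Trainer applies cross-entropy loss by default for sequence classification
trainer = Trainer(model=model, args=args, train_dataset=data)
trainer.train()
```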
## Complete Package
For a more streamlined experience, we've developed the SoT Python package that handles paradigm selection, prompt management, and exemplar formatting:
```python
from sketch_of_thought import SoT

# Initialize SoT
sot = SoT()

# Classify a question and get the appropriate paradigm
question = "Alice has 5 apples. She gives 3 apples to Bob. How many apples does Alice have?"
paradigm = sot.classify_question(question)  # Returns: 'chunked_symbolism'

# Get an initialized context with exemplars for the selected paradigm
context = sot.get_initialized_context(
    paradigm=paradigm,
    question=question,
    format="llm",
    include_system_prompt=True
)

# Use `context` as the message list for your LLM of choice
```
### Example with Qwen2.5-7B
Here's a complete example using Qwen2.5-7B-Instruct:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from sketch_of_thought import SoT

# Initialize SoT
sot = SoT()

# Load the Qwen model
model_name = "Qwen/Qwen2.5-7B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare the question
prompt = "Alice has 5 apples. She gives 3 apples to Bob. How many apples does Alice have?"

# Classify the question and build the appropriate context
paradigm = sot.classify_question(prompt)
messages = sot.get_initialized_context(
    paradigm,
    prompt,
    format="llm",
    include_system_prompt=True
)

# Format for the model
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate a response
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)

# Strip the prompt tokens, keeping only the newly generated ones
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

# Decode the response
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
Output:

```
<think>
A = 5
A -= 3
A = 2
</think>
\boxed{2}
```
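If you only need the final value programmatically, the `\boxed{...}` convention shown above can be pulled out with a small regex. A sketch, assuming answers always end with a `\boxed{}` term as in this example:

```python
import re

def extract_boxed_answer(response):
    # Grab the contents of the last \boxed{...} in the model output
    matches = re.findall(r"\\boxed\{([^}]*)\}", response)
    return matches[-1] if matches else None

print(extract_boxed_answer(response))  # "2"
```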
## Supported Formats

The SoT package supports multiple output formats:

- `"llm"`: standard chat format for text-only LLMs
- `"vlm"`: multimodal format for vision-language models
- `"raw"`: raw exemplars without formatting

### What's the difference?

#### LLM Format

Standard `messages` format for Large Language Models.
```json
[
  {
    "role": "system",
    "content": "SYSTEM_PROMPT_HERE"
  },
  {
    "role": "user",
    "content": "EXAMPLE_QUESTION_HERE"
  },
  {
    "role": "assistant",
    "content": "EXAMPLE_ANSWER_HERE"
  },
  {
    "role": "user",
    "content": "USER_QUESTION_HERE"
  }
]
```
#### VLM Format

Standard `messages` format for Large Vision-Language Models.
```json
[
  {
    "role": "system",
    "content": "SYSTEM_PROMPT_HERE"
  },
  {
    "role": "user",
    "content": [{"type": "text", "text": "EXAMPLE_QUESTION_HERE"}]
  },
  {
    "role": "assistant",
    "content": [{"type": "text", "text": "EXAMPLE_ANSWER_HERE"}]
  },
  {
    "role": "user",
    "content": [{"type": "text", "text": "USER_QUESTION_HERE"}]
  }
]
```
#### Raw Format

Raw exemplar data. Apply your own format!
```json
[
  {
    "question": "EXAMPLE_QUESTION_HERE",
    "answer": "EXAMPLE_ANSWER_HERE"
  },
  {
    "question": "EXAMPLE_QUESTION_HERE",
    "answer": "EXAMPLE_ANSWER_HERE"
  }
]
```
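For instance, a few lines of Python turn raw exemplars into whatever chat layout your stack expects. In this sketch, `raw_exemplars` is a placeholder for the list returned in this format:

```python
def raw_to_messages(raw_exemplars, user_question):
    # Interleave each exemplar as a user/assistant pair, then append the real question
    messages = []
    for ex in raw_exemplars:
        messages.append({"role": "user", "content": ex["question"]})
        messages.append({"role": "assistant", "content": ex["answer"]})
    messages.append({"role": "user", "content": user_question})
    return messages
```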
## Multilingual Support
SoT supports multiple languages. System prompts and exemplars are automatically loaded in the requested language.
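This card doesn't pin down the exact call for selecting a language; the sketch below is hypothetical, and the `language` argument in particular is an assumption rather than a confirmed parameter (check the GitHub repository for the current API):

```python
from sketch_of_thought import SoT

sot = SoT()
context = sot.get_initialized_context(
    paradigm="chunked_symbolism",
    question="Alice has 5 apples. She gives 3 apples to Bob. How many apples does Alice have?",
    format="llm",
    include_system_prompt=True,
    # language="EN",  # hypothetical parameter; not a confirmed part of the API
)
```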
## Limitations
- The model is trained to classify questions into one of three predefined paradigms and may not generalize to tasks outside the training distribution.
- Performance may vary depending on the complexity and domain of the question.
## Citation
If you find our work helpful, please cite:
```bibtex
@misc{aytes2025sot,
      title={Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching},
      author={Simon A. Aytes and Jinheon Baek and Sung Ju Hwang},
      year={2025},
      eprint={2503.05179},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2503.05179},
}
```
## License
This project is licensed under the MIT License - see the LICENSE file for details.