DMind-1-mini / README.md
metadata
license: mit
language:
  - en
  - zh
metrics:
  - accuracy
base_model:
  - Qwen/Qwen3-14B
pipeline_tag: text-generation
library_name: transformers
tags:
  - blockchain
  - conversational
  - web3
  - qwen3
eval_results:
  - task: domain-specific evaluation
    dataset: DMindAI/DMind_Benchmark
    metric: normalized web3 score
    score: 74.12
    model: DMind-1-mini
    model_rank: 2 / 24

Introduction

We introduce DMind-1, a domain-specialized LLM fine-tuned for the Web3 ecosystem via supervised instruction tuning and reinforcement learning from human feedback (RLHF).

To support real-time and resource-constrained applications, we further introduce DMind-1-mini, a compact variant distilled from both DMind-1 and a generalist LLM using a multi-level distillation framework. It retains key domain reasoning abilities while operating with significantly lower computational overhead.

DMind-1 and DMind-1-mini represent a robust foundation for intelligent agents in the Web3 ecosystem.

1. Model Overview

DMind-1-mini

To address scenarios requiring lower latency and faster inference, we introduce DMind-1-mini, a lightweight distilled version of DMind-1 based on Qwen3-14B. DMind-1-mini is trained using knowledge distillation and our custom DeepResearch framework, drawing from two teacher models:

  • DMind-1 (Qwen3-32B): Our specialized Web3 domain model.
  • GPT-o3 + DeepResearch: A general-purpose SOTA LLM, with its outputs processed through our DeepResearch framework for Web3 domain alignment.

The distillation pipeline combines:

  • Web3-specific data distillation: High-quality instruction-following and QA examples generated by the teacher models.

  • Distribution-level supervision: The student model learns to approximate the teachers' output distributions through soft-label guidance, preserving nuanced prediction behavior and confidence calibration.

  • Intermediate representation transfer: Knowledge is transferred by aligning intermediate representations between teacher and student models, promoting deeper structural understanding beyond surface-level mimicry.

This multi-level distillation strategy enables DMind-1-mini to maintain high Web3 task performance while significantly reducing computational overhead and latency, making it suitable for real-time applications such as instant Q&A, on-chain analytics, and lightweight agent deployment.
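The two training signals described above, soft-label supervision and intermediate representation alignment, can be illustrated with a minimal sketch. This is not the DMind training code: the temperature of 2.0, the 0.1 weighting, the toy logits, and all function names are assumptions chosen for illustration, and the sketch assumes the teacher and student hidden vectors have already been projected to the same width.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


def representation_loss(student_hidden, teacher_hidden):
    """Mean squared error between two equal-length hidden vectors."""
    if len(student_hidden) != len(teacher_hidden):
        raise ValueError("project both vectors to a shared width first")
    squared = [(s - t) ** 2 for s, t in zip(student_hidden, teacher_hidden)]
    return sum(squared) / len(squared)


# Toy values (hypothetical): one token position with a three-word vocabulary.
teacher_logits = [4.0, 1.0, 0.5]
student_logits = [2.5, 1.5, 0.2]
combined = soft_label_loss(student_logits, teacher_logits) + 0.1 * representation_loss(
    [0.2, -0.1], [0.25, 0.0]
)
print(f"combined distillation loss: {combined:.4f}")
```

A higher temperature flattens both distributions, so the student is also penalized for mismatches on low-probability tokens rather than only on the teacher's top choice.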

2. Evaluation Results

DMind-1 Web3 Performance

We evaluate DMind-1 and DMind-1-mini using the DMind Benchmark, a domain-specific evaluation suite designed to assess large language models in the Web3 context. The benchmark includes 1,917 expert-reviewed questions across nine core domain categories, and it features both multiple-choice and open-ended tasks to measure factual knowledge, contextual reasoning, and other abilities.

To complement accuracy metrics, we conducted a cost-performance analysis by comparing benchmark scores against publicly available input token prices across 24 leading LLMs. In this evaluation:

  • DMind-1 achieved the highest Web3 score while keeping one of the lowest input token prices among top-tier models such as Grok 3 and Claude 3.5 Sonnet.

  • DMind-1-mini ranked second, retaining over 95% of DMind-1’s performance with greater efficiency in latency and compute.

Both models sit in the high-score, low-price region of the score-versus-price plot, delivering state-of-the-art Web3 reasoning at significantly lower cost. This balance of quality and efficiency makes the DMind models competitive for both research and production use.
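One way to make "favorable region of the score-versus-price plot" concrete is a Pareto check: a model is on the frontier if no other model has both a score at least as high and a price at least as low. The sketch below illustrates that idea only. The model names, scores, and prices are placeholders, not benchmark results, and `pareto_front` is a hypothetical helper rather than part of the DMind Benchmark tooling.

```python
def pareto_front(models):
    """Return names of models not dominated on (higher score, lower price)."""
    front = []
    for name, score, price in models:
        dominated = any(
            (s >= score and p <= price) and (s > score or p < price)
            for _, s, p in models
        )
        if not dominated:
            front.append(name)
    return front


# Placeholder entries only: (name, benchmark score, input price per million tokens).
candidates = [
    ("model_a", 80.0, 1.0),
    ("model_b", 75.0, 0.5),
    ("model_c", 70.0, 2.0),
    ("model_d", 60.0, 0.4),
]
print(pareto_front(candidates))  # → ['model_a', 'model_b', 'model_d']
```

Here `model_c` drops out because `model_a` scores higher at a lower price, while the other three each offer a trade-off that no competitor beats on both axes.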

3. Use Cases

  • Expert-Level Question & Answering: Provides accurate, context-aware answers on blockchain, DeFi, smart contracts, and related Web3 topics.
  • Compliance-Aware Support: Assists in drafting or reviewing content within regulatory and legal contexts.
  • Content Generation in Domain: Produces Web3-specific blog posts, documentation, and tutorials tailored to developers and users.
  • DeFi Strategy Suggestions: Generates insights and recommendations for yield farming, liquidity provision, and portfolio strategies based on user-provided data.
  • Risk Management: Suggests strategies aligned with user risk profiles for more informed decision-making in volatile markets.

4. Quickstart

4.1 Model Downloads

| Model | Base Model | Download |
|---|---|---|
| DMind-1-mini | Qwen3-14B | Hugging Face Link |
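Since the model card lists `transformers` as the library, local inference after download might look like the sketch below. The repository id `DMindAI/DMind-1-mini` is an assumption inferred from the benchmark dataset's organization name; substitute the id shown on the download page. The sketch also assumes `torch` and `accelerate` are installed (needed for `device_map="auto"`) and that enough GPU memory is available for a 14B-parameter model.

```python
def build_messages(question):
    """Wrap a single user question in the chat-message format."""
    return [{"role": "user", "content": question}]


def generate_reply(question, repo_id="DMindAI/DMind-1-mini", max_new_tokens=512):
    """Load the model and generate one reply. repo_id is an assumed identifier."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, torch_dtype="auto", device_map="auto"
    )
    # Render the conversation with the model's own chat template.
    prompt = tokenizer.apply_chat_template(
        build_messages(question), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(generate_reply("What is a smart contract?"))
```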

4.2 OpenRouter API

You can access DMind-1-mini via the OpenRouter API. Simply specify the desired model in the model field of your request payload.

API Endpoint:

https://openrouter.ai/api/v1/chat/completions

Authentication:

  • Obtain your API key from OpenRouter
  • Include it in the Authorization header as Bearer YOUR_API_KEY

Model Identifiers:

  • DMind-1-mini: compact model distilled from DMind-1

Example Request (Python):

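import requests

# Replace YOUR_API_KEY with your OpenRouter API key.
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
}

data = {
    "model": "DMind-1-mini",
    "messages": [
        {"role": "user", "content": "Explain DeFi in simple terms."}
    ],
}

response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers=headers,
    json=data,
    timeout=60,  # avoid hanging indefinitely on a stalled connection
)
response.raise_for_status()  # raise on HTTP errors instead of printing an error body
print(response.json())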

Example Request (cURL):

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "DMind-1-mini",
    "messages": [{"role": "user", "content": "What is a smart contract?"}]
  }'

Notes:

  • Replace YOUR_API_KEY with your actual OpenRouter API key.
  • Set the model field to DMind-1-mini.
  • DMind-1 and DMind-1-mini share the same API structure, so switching between them only requires changing the model field.
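The requests above print the whole JSON response. As a sketch of how a client might keep the key out of source code and pull out only the reply text, the following uses the standard library alone. The environment variable name `OPENROUTER_API_KEY` and the helper names are assumptions, and the response handling assumes the OpenAI-style chat completion layout (`choices[0].message.content`) with failures reported under a top-level `error` key.

```python
import json
import os
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_request(prompt, model="DMind-1-mini", api_key=None):
    """Build a POST request; the key falls back to an assumed environment variable."""
    api_key = api_key or os.environ.get("OPENROUTER_API_KEY", "")
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def extract_reply(response_json):
    """Return the assistant text from an OpenAI-style chat completion payload."""
    if "error" in response_json:
        raise RuntimeError(f"API error: {response_json['error']}")
    return response_json["choices"][0]["message"]["content"]


def ask(prompt, timeout=60):
    """Send the prompt and return only the reply text."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return extract_reply(json.load(resp))


# Example call (requires a valid key in the environment):
# print(ask("What is a smart contract?"))
```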

4.3 OpenRouter Web Chat

You can try DMind-1-mini instantly using the OpenRouter Web Chat.

  • Select your desired model from the dropdown menu (DMind-1-mini).
  • Enter your prompt and interact with the model in real time.

License

  • The code repository and model weights for DMind-1-mini are released under the MIT License.
  • Commercial use, modification, and derivative works (including distillation and fine-tuning) are permitted.
  • Base Models:
    • DMind-1-mini is derived from Qwen3-14B, which is released under the Apache 2.0 License.
    • Please ensure compliance with the original base model licenses when using or distributing derivatives.

Contact

For questions or support, please contact [email protected]