deepseek-ai/DeepSeek-R1

Best practices for evaluating R1 models: reasoning efficiency and performance by MATH level

#198

by wangxingjun778 - opened 1 day ago

EvalScope - LLM Evaluation Framework: https://github.com/modelscope/evalscope

Best Practices for Evaluating R1-Class Model Inference Capabilities:
https://evalscope.readthedocs.io/en/latest/best_practice/deepseek_r1_distill.html
Best Practices for Evaluating Reasoning Efficiency:
https://evalscope.readthedocs.io/en/latest/best_practice/think_eval.html
Best Practices for Evaluating R1 and QwQ Inference Efficiency and MATH Level:
https://evalscope.readthedocs.io/en/latest/best_practice/eval_qwq.html