qihoo360
/

Light-R1-32B-DS

Text Generation

text-generation-inference

Model card Files Files and versions Community

zhs12 commited on 14 days ago

Commit

a9c865d

·

verified ·

1 Parent(s): b36c265

Update README.md

Files changed (1) hide show

README.md +3 -1

README.md CHANGED Viewed

@@ -13,6 +13,8 @@ base_model:
 | [Light-R1-32B (ours) 🤗](https://huggingface.co/qihoo360/Light-R1-32B) |Qwen2.5-32B-Instruct|25.3.4|76.6|64.6|61.8|
 | QwQ-32B |N/A|25.3.6|78.5|69.3|67.7|
 [GitHub page](https://github.com/Qihoo360/Light-R1)
@@ -20,7 +22,7 @@ Light-R1-32B-DS is a near-SOTA 32B math model with AIME24 & 25 scores 78.1 & 65.
 Originated from DeepSeek-R1-Distill-Qwen-32B, Light-R1-32B-DS is further trained with only [3K SFT data](https://huggingface.co/datasets/qihoo360/Light-R1-SFTData) as we've open-sourced, demonstrating the strong applicability of the released data.
-We are excited to release this model along with the [technical report](https://github.com/Qihoo360/Light-R1/blob/main/Light-R1.pdf).
 ## Usage
 Same as DeepSeek-R1-Distill-Qwen-32B.

 | [Light-R1-32B (ours) 🤗](https://huggingface.co/qihoo360/Light-R1-32B) |Qwen2.5-32B-Instruct|25.3.4|76.6|64.6|61.8|
 | QwQ-32B |N/A|25.3.6|78.5|69.3|67.7|
+[technical report](https://arxiv.org/abs/2503.10460)
 [GitHub page](https://github.com/Qihoo360/Light-R1)
 Originated from DeepSeek-R1-Distill-Qwen-32B, Light-R1-32B-DS is further trained with only [3K SFT data](https://huggingface.co/datasets/qihoo360/Light-R1-SFTData) as we've open-sourced, demonstrating the strong applicability of the released data.
+We are excited to release this model along with the [technical report](https://arxiv.org/abs/2503.10460).
 ## Usage
 Same as DeepSeek-R1-Distill-Qwen-32B.