yixinsong and nielsr (HF Staff) committed
Commit 7f837c6 · verified · 1 Parent(s): 512a7f0

Improve model card: Add `library_name`, explicit paper and code links (#3)


- Improve model card: Add `library_name`, explicit paper and code links (6526c2455757fa91b7aae2c0aaa2783d627be85e)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1):
  1. README.md +10 -4
README.md CHANGED

```diff
@@ -1,9 +1,16 @@
 ---
-license: apache-2.0
 language:
 - en
+license: apache-2.0
 pipeline_tag: text-generation
+library_name: transformers
 ---
+
+# SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
+
+**Paper**: [SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment](https://huggingface.co/papers/2507.20984)
+**Code**: [https://github.com/SJTU-IPADS/SmallThinker](https://github.com/SJTU-IPADS/SmallThinker)
+
 ## Introduction
 
 <p align="center">
@@ -42,7 +49,7 @@ All models are evaluated in non-thinking mode.
 | Qwen3 0.6B | 0.6 | 148.56 | 94.91 | 45.93 | 15.29 | 27.44 | 13.32 | 9.76 |
 | Qwen3 1.7B | 1.3 | 62.24 | 41.00 | 20.29 | 6.09 | 11.08 | 6.35 | 4.15 |
 | Qwen3 1.7B+limited memory | limit 1G | 2.66 | 1.09 | 1.00 | 0.47 | - | - | 0.11 |
-| Gemma3n E2B | 1G, theoretically | 36.88 | 27.06 | 12.50 | 3.80 | 6.66 | 3.46 | 2.45 |
+| Gemma3n E2B | 1G, theoretically | 36.88 | 27.06 | 12.50 | 3.80 | 6.66 | 3.80 | 2.45 |
 
 Note: i9 14900, 1+13 8ge4 use 4 threads, others use the number of threads that can achieve the maximum speed. All models here have been quantized to q4_0.
 
@@ -115,5 +122,4 @@ from modelscope import AutoModelForCausalLM, AutoTokenizer
 ## Statement
 - Due to the constraints of its model size and the limitations of its training data, its responses may contain factual inaccuracies, biases, or outdated information.
 - Users bear full responsibility for independently evaluating and verifying the accuracy and appropriateness of all generated content.
-- SmallThinker does not possess genuine comprehension or consciousness and cannot express personal opinions or value judgments.
-
+- SmallThinker does not possess genuine comprehension or consciousness and cannot express personal opinions or value judgments.
```
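The commit leaves the README with a complete YAML front-matter block (`language`, `license`, `pipeline_tag`, and the newly added `library_name`). As a rough sanity check of that metadata, here is a minimal sketch that parses the resulting front matter and asserts the expected fields are present. The `parse_front_matter` helper is our own illustration (a tiny subset parser, not Hub tooling, and not a general YAML parser).

```python
# Sketch: validate the model-card front matter this commit produces.
# The string below mirrors the "+" side of the first diff hunk.
front_matter = """\
language:
- en
license: apache-2.0
pipeline_tag: text-generation
library_name: transformers
"""

def parse_front_matter(text):
    """Tiny YAML-subset parser: flat key/value pairs plus one-level
    string lists, which is all this front matter uses."""
    data = {}
    current_key = None
    for line in text.splitlines():
        if line.startswith("- ") and current_key is not None:
            # List item belonging to the most recent key (e.g. language).
            data.setdefault(current_key, []).append(line[2:].strip())
        elif ":" in line:
            key, _, value = line.partition(":")
            current_key = key.strip()
            value = value.strip()
            # Empty value means a list follows on subsequent lines.
            data[current_key] = value if value else []
    return data

meta = parse_front_matter(front_matter)
assert meta["library_name"] == "transformers"
assert meta["pipeline_tag"] == "text-generation"
assert meta["language"] == ["en"]
```

On the Hub, `library_name: transformers` is what lets the model page surface the right loading snippet and lets `pipeline_tag: text-generation` place the model under the correct task filter, which is why the commit adds both explicitly rather than relying on auto-detection.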