eliebak committed (verified)
Commit 1f21d13 · 1 Parent(s): f33ac84

Update README.md

Files changed (1)
  1. README.md +9 -14
README.md CHANGED
@@ -1,6 +1,8 @@
---
library_name: transformers
- tags: []
+ license: apache-2.0
+ language:
+ - en
---


@@ -10,23 +12,16 @@ tags: []
## Table of Contents

1. [Model Summary](##model-summary)
- 2. [Use](##use)
- 3. [Limitations](##limitations)
- 4. [Training](##training)
- 5. [License](##license)
- 6. [Citation](##citation)
+ 2. [Limitations](##limitations)
+ 3. [Training](##training)
+ 4. [License](##license)
+ 5. [Citation](##citation)

## Model Summary

SmolLM is a series of state-of-the-art small language models available in three sizes: 135M, 360M, and 1.7B parameters. These models are built on Cosmo-Corpus, a meticulously curated high-quality training dataset. Cosmo-Corpus includes Cosmopedia v2 (28B tokens of synthetic textbooks and stories generated by Mixtral), Python-Edu (4B tokens of educational Python samples from The Stack), and FineWeb-Edu (220B tokens of deduplicated educational web samples from FineWeb). SmolLM models have shown promising results when compared to other models in their size categories across various benchmarks testing common sense reasoning and world knowledge. For detailed information on training, benchmarks and performance, please refer to our full blog post ADD LINK WHEN PUBLISH.


- ## Use
-
- ### Intended use
-
- The model was trained on [HuggingFaceTB/cosmo-corpus-v2](link)
-
### Generation
First, make sure to install `transformers` from source:
```bash
@@ -104,7 +99,7 @@ The model has been trained on source code from 600+ programming languages. The p

## Model

- - **Architecture:** Transformer decoder with grouped-query and sliding window attention and Fill-in-the-Middle objective
+ - **Architecture:** See the blog post
- **Pretraining steps:** 600k
- **Pretraining tokens:** 600B
- **Precision:** bfloat16
@@ -119,7 +114,7 @@ The model has been trained on source code from 600+ programming languages. The p

# License

- TO MODIFY
+ [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)

# Citation
TO MODIFY
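Note that the Generation hunk above is truncated at the opening ```bash fence, so the install and sampling commands themselves are collapsed out of this diff. As a rough, illustrative sketch only (not taken from the commit): generation with `transformers` usually looks like the following, where the checkpoint id `HuggingFaceTB/SmolLM-1.7B` is an assumption based on the organization and model sizes mentioned in the card.

```python
# Illustrative only: the actual snippet lives in the collapsed part of the README.
# Install transformers from source first, e.g.
#   pip install git+https://github.com/huggingface/transformers.git
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM-1.7B"  # assumed repo id, not confirmed by this commit
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# The card lists bfloat16 pretraining precision; fall back to float32 on CPU.
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.bfloat16 if device == "cuda" else torch.float32,
).to(device)

inputs = tokenizer("Gravity is", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Consult the rendered README for the authoritative snippet; the hunk here only shows that `transformers` must be installed from source.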
 
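One figure the Model section implies but does not state: 600B pretraining tokens over 600k steps averages out to one million tokens per optimizer step. A quick check, assuming both numbers are exact as listed:

```python
# Derived from the figures in the diff above (600B tokens, 600k steps);
# the per-step batch composition (sequences x sequence length) is not given there.
tokens = 600e9
steps = 600e3
print(f"{tokens / steps:,.0f} tokens per optimizer step")  # -> 1,000,000
```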