Update README.md
README.md CHANGED
@@ -52,7 +52,7 @@ Doge uses Dynamic Mask Attention as sequence transformation and can use Multi-La
## Model Details

-We build the Doge by doing Per-Training on [Smollm-Corpus](https://huggingface.co/datasets/HuggingFaceTB/smollm-corpus). If you want to continue pre-training this model, you can find the unconverged checkpoint [here](https://huggingface.co/SmallDoge/Doge-
+We built Doge by pre-training on [Smollm-Corpus](https://huggingface.co/datasets/HuggingFaceTB/smollm-corpus). If you want to continue pre-training this model, you can find the unconverged checkpoint [here](https://huggingface.co/SmallDoge/Doge-60M-checkpoint). This model has not been fine-tuned for instruction following; the instruction-tuned model is [here](https://huggingface.co/SmallDoge/Doge-60M-Instruct).
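For readers who want to try the checkpoints linked above, here is a minimal sketch (not part of this commit) of loading them with Hugging Face `transformers`. The model IDs are taken from the links in the paragraph; `trust_remote_code=True` is an assumption that applies only if the Doge architecture ships as custom code on the Hub rather than natively in `transformers`.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Instruction-tuned model; swap in "SmallDoge/Doge-60M-checkpoint" to
# continue pre-training from the unconverged checkpoint instead.
model_id = "SmallDoge/Doge-60M-Instruct"

# trust_remote_code=True is an assumption: only needed if the Doge
# modeling code lives on the Hub as custom code.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Hi, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```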
**Pre-Training**: