Update README.md
README.md
CHANGED
@@ -24,6 +24,11 @@ We take ChatML as our chat template:
 
 As we merged the predictors for FFN neurons in models, you can finetune TurboSparse-Mistral with any framework and algorithm.
 
+## Limitations
+* TurboSparse, having just undergone training with 150B tokens, may still exhibit performance gaps in certain tasks.
+* The TurboSparse model has only been trained on English-language datasets, hence its capabilities in other languages are still lacking.
+* The model may produce unexpected outputs due to its small size and probabilistic generation paradigm.
+
 ## License
 
 The model is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage.