PowerInfer/TurboSparse-Mistral-Instruct

Feature Extraction


yixinsong committed on Jun 7, 2024

Commit 70b61f3 (verified) · 1 Parent(s): e1381e5

Update README.md

Files changed (1)

README.md +5 -0


@@ -24,6 +24,11 @@ We take ChatML as our chat template:
 As we merged the predictors for FFN neurons in models, you can finetune TurboSparse-Mistral with any framework and algorithm.
+## Limitations
+* TurboSparse, having just undergone training with 150B tokens, may still exhibit performance gaps in certain tasks.
+* The TurboSparse model has only been trained on English-language datasets, hence its capabilities in other languages are still lacking.
+* The model may produce unexpected outputs due to its small size and probabilistic generation paradigm.
 ## License
 The model is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage.