Joseph717171 committed · Commit 2fe5bb2 (verified) · 1 Parent(s): f167854

Update README.md

Files changed (1)
  1. README.md +3 -0
README.md CHANGED
@@ -13,6 +13,9 @@ license: apache-2.0
 This is Mistral-12.25B-Instruct-v0.2, a depth-upscaled version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2).
 
 This model is intended to be used as a basis for further fine-tuning, or as a drop-in upgrade from the original 7 billion parameter model.
+
+Paper detailing how Depth-Up Scaling works: [SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling](https://arxiv.org/abs/2312.15166)
+
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
 # UpStage's conclusionary limitations of their research: