Joseph717171 committed
Commit 33ff0cf · verified · 1 Parent(s): d9adedb

Update README.md

Files changed (1)
  1. README.md +10 -2
README.md CHANGED
@@ -4,10 +4,15 @@ library_name: transformers
 tags:
 - mergekit
 - merge
-
+license: apache-2.0
 ---
+
+# Credit for the model card's description goes to ddh0 and mergekit
 # Mistral-12.25B-Instruct-v0.2
 
+This is # Mistral-12.25B-Instruct-v0.2, a depth-upscaled version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2).
+
+This model is intended to be used as a basis for further fine-tuning, or as a drop-in upgrade from the original 7 billion parameter model.
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
 ## Merge Details
@@ -27,6 +32,9 @@ The following YAML configuration was used to produce this model:
 ```yaml
 dtype: bfloat16
 merge_method: passthrough
+# Depth UpScaled (DUS) version of Mistral-7B-v0.2
+# where m = 4 (The number of layers to remove from the model)
+# s = 56 (The number of layers the model will have after the DUS)
 slices:
 - sources:
   - layer_range: [0, 28]
@@ -35,4 +43,4 @@ slices:
   - layer_range: [4, 32]
     model: /Users/jsarnecki/opt/Workspace/mistralai/Mistral-7B-Instruct-v0.2
 
-```
+```
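As a sanity check on the DUS comments added in this commit (this sketch is not part of the commit or of mergekit, just the layer arithmetic implied by the config): each copy of the 32-layer Mistral-7B-Instruct-v0.2 contributes one slice, `[0, 28]` and `[4, 32]`, so with m = 4 layers trimmed at the seam the merged model ends up with s = 2 × (32 − 4) = 56 layers.

```python
# Layer arithmetic behind the Depth UpScaling (DUS) comments in the YAML
# config above. Function name and structure are illustrative, not mergekit API.

def dus_layer_count(n_layers: int, m: int) -> int:
    """Total layers after a passthrough DUS merge of two copies of an
    n_layers-deep model, dropping m layers at the seam of each copy."""
    first_slice = range(0, n_layers - m)   # layer_range: [0, 28]
    second_slice = range(m, n_layers)      # layer_range: [4, 32]
    return len(first_slice) + len(second_slice)

if __name__ == "__main__":
    s = dus_layer_count(n_layers=32, m=4)
    print(s)  # 56, matching the "s = 56" comment in the config
```

The 56-layer result is what makes this roughly a 12.25B-parameter model: the two 28-layer slices overlap in source layers but are stacked end to end by the `passthrough` merge method.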