Update README.md
README.md
CHANGED
@@ -11,8 +11,8 @@ tags:
 
 This method employs mergekit's passthrough method to expand blocks within the "mistralai/Mistral-7B-Instruct-v0.2" model. For every 5th layer,
 a new layer is added, with the `o_proj` and `down_proj` parameters of these added layers initialized to zero, mirroring the approach used in LLaMA Pro.
-
-while all other layers remain frozen.
+
+### It's important to note that this configuration has not undergone fine-tuning, so it will not work as-is. When fine-tuning, ensure that only every 5th layer is trainable, while all other layers remain frozen.
 
 ## 🧩 Configuration
 
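For illustration only, here is a minimal PyTorch sketch of the recipe the diff describes: duplicate every 5th decoder layer with `o_proj` and `down_proj` zero-initialized, then freeze everything except the inserted layers before fine-tuning. The actual expansion is produced by mergekit's passthrough merge (see the Configuration section); the module paths (`model.model.layers`, `self_attn.o_proj`, `mlp.down_proj`) match the `transformers` Mistral implementation, but the exact insertion indices are an assumption to check against the real merge config.

```python
import copy

import torch
from torch import nn
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    torch_dtype=torch.bfloat16,
)

# Insert a copy after every 5th decoder layer (assumed spacing). Zeroing
# `o_proj` and `down_proj` makes each new layer start as an identity
# mapping, as in LLaMA Pro.
expanded, new_idxs = [], []
for i, layer in enumerate(model.model.layers):
    expanded.append(layer)
    if (i + 1) % 5 == 0:
        new_layer = copy.deepcopy(layer)
        nn.init.zeros_(new_layer.self_attn.o_proj.weight)
        nn.init.zeros_(new_layer.mlp.down_proj.weight)
        new_idxs.append(len(expanded))
        expanded.append(new_layer)
model.model.layers = nn.ModuleList(expanded)
model.config.num_hidden_layers = len(expanded)

# Freeze everything except the newly inserted layers, then fine-tune
# (train with use_cache=False, since duplicated layers share cache indices).
for p in model.parameters():
    p.requires_grad = False
for i in new_idxs:
    for p in model.model.layers[i].parameters():
        p.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"new layers at {new_idxs}, trainable params: {trainable:,}")
```

Because the attention and MLP blocks of each inserted layer output exactly zero at initialization, only the residual path is active, so the expanded model initially reproduces the base model's outputs and only departs from them as the new layers are trained.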