Update README.md
README.md
CHANGED
@@ -11,8 +11,8 @@ tags:
 
 This method employs mergekit's passthrough method to expand blocks within the "mistralai/Mistral-7B-Instruct-v0.2" model. For every 5th layer,
 a new layer is added, with the `o_proj` and `down_proj` parameters of these added layers initialized to zero, mirroring the approach used in LLaMA Pro.
-
-while all other layers remain frozen.
+
+### It's important to note that this configuration has not undergone fine-tuning, so it will not work as-is. When fine-tuning, ensure that only every 5th layer is trainable, while all other layers remain frozen.
 
 ## 🧩 Configuration
 
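For illustration only, here is a minimal PyTorch sketch of the recipe the diff describes: duplicate every 5th decoder layer with `o_proj` and `down_proj` zero-initialized, then freeze everything except the inserted layers before fine-tuning. The actual expansion is produced by mergekit's passthrough merge (see the Configuration section); the module paths (`model.model.layers`, `self_attn.o_proj`, `mlp.down_proj`) match the `transformers` Mistral implementation, but the exact insertion indices are an assumption to check against the real merge config.

```python
import copy

import torch
from torch import nn
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    torch_dtype=torch.bfloat16,
)

# Insert a copy after every 5th decoder layer (assumed spacing). Zeroing
# `o_proj` and `down_proj` makes each new layer start as an identity
# mapping, as in LLaMA Pro.
expanded, new_idxs = [], []
for i, layer in enumerate(model.model.layers):
    expanded.append(layer)
    if (i + 1) % 5 == 0:
        new_layer = copy.deepcopy(layer)
        nn.init.zeros_(new_layer.self_attn.o_proj.weight)
        nn.init.zeros_(new_layer.mlp.down_proj.weight)
        new_idxs.append(len(expanded))
        expanded.append(new_layer)
model.model.layers = nn.ModuleList(expanded)
model.config.num_hidden_layers = len(expanded)

# Freeze everything except the newly inserted layers, then fine-tune
# (train with use_cache=False, since duplicated layers share cache indices).
for p in model.parameters():
    p.requires_grad = False
for i in new_idxs:
    for p in model.model.layers[i].parameters():
        p.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"new layers at {new_idxs}, trainable params: {trainable:,}")
```

Because the attention and MLP blocks of each inserted layer output exactly zero at initialization, only the residual path is active, so the expanded model initially reproduces the base model's outputs and only departs from them as the new layers are trained.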