Update README.md
Browse files
README.md
CHANGED
@@ -146,12 +146,6 @@ def enable_grad_only_every_nth(model, n):
|
|
146 |
for all other components of the model, including the embedding layers and the model's head. This setup is particularly
|
147 |
useful for fine-tuning processes where only a subset of layers are targeted for updates, ensuring efficient training and
|
148 |
adaptation of newly integrated layers while maintaining the pre-trained behavior of other model components.
|
149 |
-
|
150 |
-
:param model: The model instance, which is expected to have a structure compatible with selective layer training, such
|
151 |
-
as AutoModelForCausalLM.
|
152 |
-
:param n: The interval at which layers are selected for gradient enabling, starting with the first layer. This
|
153 |
-
parameter determines the sparsity of active training within the model's architecture, allowing for focused updates
|
154 |
-
on specific layers.
|
155 |
"""
|
156 |
|
157 |
# Freeze embeddings.
|
@@ -180,5 +174,4 @@ model = transformers.AutoModelForCausalLM.from_pretrained(
|
|
180 |
# Update layer gradients, specify the correct value for n based on your model's architecture
|
181 |
n = 5
|
182 |
enable_grad_only_every_nth(model, n)
|
183 |
-
model_args.model_name_or_path = model
|
184 |
```
|
|
|
146 |
for all other components of the model, including the embedding layers and the model's head. This setup is particularly
|
147 |
useful for fine-tuning processes where only a subset of layers are targeted for updates, ensuring efficient training and
|
148 |
adaptation of newly integrated layers while maintaining the pre-trained behavior of other model components.
|
|
|
|
|
|
|
|
|
|
|
|
|
149 |
"""
|
150 |
|
151 |
# Freeze embeddings.
|
|
|
174 |
# Update layer gradients, specify the correct value for n based on your model's architecture
|
175 |
n = 5
|
176 |
enable_grad_only_every_nth(model, n)
|
|
|
177 |
```
|