voidful committed on
Commit 5211b29 · verified · 1 Parent(s): 6688013

Update README.md

Files changed (1): README.md +77 -2
README.md CHANGED
@@ -5,9 +5,84 @@ tags: []
 
 # Model Card for Model ID
 
- <!-- Provide a quick summary of what the model is/does. -->
-
+ Patched LLaMA 3.2 8B: a text-only model extracted from the LLaMA 3.2 11B Vision model by copying its language-model weights into a LLaMA 3.1 8B-Instruct skeleton.
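+
+ The 11B Vision model's language stack interleaves cross-attention layers (used to attend to image features) among its self-attention layers; the text-only 8B model has no counterpart for those, which is why the patching loop below shifts source layer indices with `skip_layer`. To see the layout yourself, here is a quick sketch, assuming `model` is the 11B model loaded as in the script below (module paths mirror the state-dict keys used there; adjust if your transformers version nests them differently):
+
+ ```python
+ # Print each language-model layer's class name to see where the
+ # cross-attention layers sit among the self-attention layers
+ for i, layer in enumerate(model.language_model.model.layers):
+     print(i, type(layer).__name__)
+ ```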
+
+ Here's the complete, refined code for patching the weights:
+ ```python
+ # Import required libraries
+ from transformers import AutoProcessor, AutoTokenizer, AutoModelForImageTextToText, AutoModelForCausalLM
+
+ # Load the 11B Vision-Instruct model
+ processor = AutoProcessor.from_pretrained("meta-llama/Llama-3.2-11B-Vision-Instruct")
+ model = AutoModelForImageTextToText.from_pretrained("meta-llama/Llama-3.2-11B-Vision-Instruct")
+
+ # Load the 8B text-only model
+ s_tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
+ s_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
+
+ # Prepare input text for testing
+ input_text = "Write me a poem about Machine Learning."
+ input_ids = s_tokenizer(input_text, return_tensors="pt")
+
+ # Test the original 8B model
+ outputs = s_model.generate(**input_ids, do_sample=False, max_new_tokens=10)
+ print("8B Model Output:", s_tokenizer.decode(outputs[0]))
+
+ # Patch weights from the 11B model into the 8B model. The 11B language model
+ # interleaves cross-attention layers among its self-attention layers; the
+ # text-only model has no counterpart for those, so `skip_layer` shifts the
+ # source index past each cross-attention layer encountered.
+ model_weight = model.state_dict()
+ s_model_dict = s_model.state_dict()
+ skip_layer = 0  # Number of cross-attention layers skipped so far
+
+ for key in s_model_dict.keys():
+     if "layers." in key:
+         layer_idx = int(key.split("layers.")[1].split(".")[0])  # Extract layer index
+         try:
+             s_model_dict[key] = model_weight[
+                 "language_model." + key.replace(f"layers.{layer_idx}.", f"layers.{layer_idx + skip_layer}.")
+             ]
+         except KeyError:
+             # The shifted index landed on a cross-attention layer; skip it and retry
+             skip_layer += 1
+             s_model_dict[key] = model_weight[
+                 "language_model." + key.replace(f"layers.{layer_idx}.", f"layers.{layer_idx + skip_layer}.")
+             ]
+     else:
+         # Non-layer weights (embeddings, final norm, lm_head) map over directly.
+         # Note: the vision model's embed_tokens carries a few extra rows for
+         # image tokens, so trim any oversized tensor to the text model's shape.
+         w = model_weight["language_model." + key]
+         if w.shape != s_model_dict[key].shape:
+             w = w[: s_model_dict[key].shape[0]]
+         s_model_dict[key] = w
+
+ # Reassigning dict entries does not touch the live parameters, so load the
+ # patched state dict back into the 8B model for the patch to take effect
+ s_model.load_state_dict(s_model_dict)
+
+ # Test the patched 8B model
+ outputs = s_model.generate(**input_ids, do_sample=False, max_new_tokens=10)
+ print("Patched 8B Model Output:", s_tokenizer.decode(outputs[0]))
+
+ # Test the original 11B model
+ outputs = model.generate(**input_ids, do_sample=False, max_new_tokens=10)
+ print("11B Model Output:", s_tokenizer.decode(outputs[0]))
+ ```
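+
+ The patch above only lives in memory. To keep it, the patched model can be saved like any other causal LM; a minimal sketch (the output directory name is just an example):
+
+ ```python
+ # Persist the patched model and its tokenizer (directory name is illustrative)
+ s_model.save_pretrained("llama-3.2-8b-patched")
+ s_tokenizer.save_pretrained("llama-3.2-8b-patched")
+
+ # Later, reload it as a plain text-only model
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ reloaded_model = AutoModelForCausalLM.from_pretrained("llama-3.2-8b-patched")
+ reloaded_tokenizer = AutoTokenizer.from_pretrained("llama-3.2-8b-patched")
+ ```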
+
+ ### **Example Outputs**
+
+ **Prompt:** "Write me a poem about Machine Learning."
+
+ **Outputs:**
+ 1. **8B Model Output (Before Patching):**
+    ```
+    <|begin_of_text|>Write me a poem about Machine Learning.
+    Artificial minds, born from code,
+    Learning
+    ```
+
+ 2. **Patched 8B Model Output:**
+    ```
+    <|begin_of_text|>Write me a poem about Machine Learning.
+    In silicon halls, where data reigns
+    ```
+
+ 3. **11B Model Output:**
+    ```
+    <|begin_of_text|>Write me a poem about Machine Learning.
+    In silicon halls, where data reigns
+    ```
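+
+ Beyond eyeballing the generations, the patch can be spot-checked by comparing tensors directly. A minimal sketch, assuming the variables from the script above are still in scope (the layer and key chosen here are just examples):
+
+ ```python
+ import torch
+
+ # Layer 0 is a self-attention layer in both models, so its weights should
+ # be identical between the patched 8B and the 11B language model
+ a = s_model.state_dict()["model.layers.0.self_attn.q_proj.weight"]
+ b = model.state_dict()["language_model.model.layers.0.self_attn.q_proj.weight"]
+ print(torch.equal(a, b))  # expected: True after patching
+ ```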
+
+ ---
 
 ## Model Details