---
license: other
tags:
- merge
- mergekit
- lazymergekit
base_model:
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
---
![image/png](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/X1tDlFYMMFPNI_YkDXYbE.png)

# Meta-Llama-3-225B-Instruct

This is a quantized version of [mlabonne/Meta-Llama-3-225B-Instruct](https://huggingface.co/mlabonne/Meta-Llama-3-225B-Instruct) created using llama.cpp.
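
If you grab one of the GGUF files from this repo, a minimal sketch of running it with llama.cpp's CLI looks like the following. The filename is a placeholder for whichever quant you actually downloaded, and `llama-cli` must be built and on your PATH:

```python
# Placeholder filename: substitute the GGUF quant file you downloaded.
# Assumes a llama.cpp build with llama-cli available on PATH.
!llama-cli -m Meta-Llama-3-225B-Instruct.Q4_K_M.gguf -p "What is a large language model?" -n 256
```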

Meta-Llama-3-225B-Instruct is a self-merge of [meta-llama/Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct).

It was inspired by large merges like:

- [alpindale/goliath-120b](https://huggingface.co/alpindale/goliath-120b)
- [nsfwthrowitaway69/Venus-120b-v1.0](https://huggingface.co/nsfwthrowitaway69/Venus-120b-v1.0)
- [cognitivecomputations/MegaDolphin-120b](https://huggingface.co/cognitivecomputations/MegaDolphin-120b)
- [wolfram/miquliz-120b-v2.0](https://huggingface.co/wolfram/miquliz-120b-v2.0)

I don't recommend using it, as it seems to break quite easily (but feel free to prove me wrong).

## 🧩 Configuration

```yaml
slices:
- sources:
  - layer_range: [0, 20]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [10, 30]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [20, 40]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [30, 50]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [40, 60]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [50, 70]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [60, 80]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [70, 90]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [80, 100]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [90, 110]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [100, 120]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [110, 130]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [120, 140]
    model: mlabonne/Meta-Llama-3-120B-Instruct
merge_method: passthrough
dtype: float16
```
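
To reproduce the merge rather than download it, the YAML above can be fed straight to mergekit. The layer arithmetic is visible in the config: 13 overlapping 20-layer windows over the 140-layer 120B model yield 13 × 20 = 260 layers, which is roughly where the 225B parameter count comes from. A minimal sketch, assuming the config is saved as `config.yaml` (the output directory name is a placeholder, and the float16 output is very large):

```python
!pip install -qU mergekit

# Run the passthrough merge defined in the config above.
# "config.yaml" and the output directory are placeholders.
!mergekit-yaml config.yaml ./Meta-Llama-3-225B-Instruct --copy-tokenizer
```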

## 💻 Usage

```python
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mlabonne/Meta-Llama-3-225B-Instruct"
messages = [{"role": "user", "content": "What is a large language model?"}]

tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
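
At float16, 225B parameters are roughly 450 GB of weights alone, so the pipeline above realistically needs a multi-GPU node. A hedged alternative that is not part of the original card: load the model in 4-bit with bitsandbytes, which cuts the footprint to roughly a quarter. A minimal sketch:

```python
!pip install -qU transformers accelerate bitsandbytes

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mlabonne/Meta-Llama-3-225B-Instruct"

# NF4 4-bit quantization with float16 compute: about half a byte per weight,
# versus two bytes per weight at float16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# The chat-template + generation flow from the pipeline example applies unchanged.
```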