monet9736 committed on
Commit f41f1fb · verified · 1 Parent(s): 96f5f23

Create README.md

Files changed (1)
  1. README.md +394 -0
README.md ADDED
@@ -0,0 +1,394 @@
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ pipeline_tag: text-generation
6
+ library_name: transformers
7
+ ---
8
+
9
+ # Monet: Mixture of Monosemantic Experts for Transformers
10
+
11
+ ## Model Summary
12
+
13
+ Monet introduces a novel approach for improving mechanistic interpretability in large language models (LLMs) using a Sparse Mixture-of-Experts (SMoE) architecture with 262,144 experts. By integrating sparse dictionary learning directly into end-to-end pretraining, Monet tackles the core issue of polysemanticity—where single neurons encode multiple unrelated concepts—while preserving overall model performance.
14
+
15
+
16
+ ### Resources and Technical Documentation
17
+
18
+ - **GitHub Repository**: https://github.com/dmis-lab/Monet
19
+ - **Paper**: https://arxiv.org/abs/2412.04139
20
+ - **Model Hub**: https://huggingface.co/MonetLLM
21
+ - **Demo**: https://huggingface.co/spaces/MonetLLM/monet-vd-1.4B-100BT-hf-viewer
22
+
23
+ ### Available Checkpoints
24
+
25
+ #### Base Models
26
+
27
+
28
+ <table class="center">
29
+ <tr>
30
+ <td align="center"><b>Model</b></td>
31
+ <td align="center"><b>Dataset</b></td>
32
+ <td align="center"><b>#Params</b></td>
33
+ <td align="center"><b>#Tokens</b></td>
34
+ <td align="center"><b>Checkpoint</b></td>
35
+ <td align="center"><b>Demo</b></td>
36
+ </tr>
37
+ <tr>
38
+ <td align="center" rowspan="4"><b>Monet-VD</b></td>
39
+ <td align="center" rowspan="3"><a href="https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu">FineWeb-Edu</a></td>
40
+ <td align="center">850M</td>
41
+ <td align="center">100BT</td>
42
+ <td><a href="https://huggingface.co/MonetLLM/monet-vd-850M-100BT-hf">monet-vd-850M-100BT-hf</a></td>
43
+ <td></td>
44
+ </tr>
45
+ <tr>
46
+ <td align="center">1.4B</td>
47
+ <td align="center">100BT</td>
48
+ <td><a href="https://huggingface.co/MonetLLM/monet-vd-1.4B-100BT-hf">monet-vd-1.4B-100BT-hf</a></td>
49
+ <td><a href="https://huggingface.co/spaces/MonetLLM/monet-vd-1.4B-100BT-hf-viewer">Viewer</a></td>
50
+ </tr>
51
+ <tr>
52
+ <td align="center">4.1B</td>
53
+ <td align="center">100BT</td>
54
+ <td><a href="https://huggingface.co/MonetLLM/monet-vd-4.1B-100BT-hf">monet-vd-4.1B-100BT-hf</a></td>
55
+ <td></td>
56
+ </tr>
57
+ <tr>
58
+ <td align="center"><a href="https://huggingface.co/datasets/bigcode/starcoderdata">StarCoderData</a></td>
59
+ <td align="center">1.4B</td>
60
+ <td align="center">100BT</td>
61
+ <td><a href="https://huggingface.co/MonetLLM/codemonet-vd-1.4B-100BT-hf">codemonet-vd-1.4B-100BT-hf</a></td>
62
+ <td><a href="https://huggingface.co/spaces/MonetLLM/codemonet-vd-1.4B-100BT-hf-viewer">Viewer</a></td>
63
+ </tr>
64
+ <tr>
65
+ <td align="center" rowspan="3"><b>Monet-HD</b></td>
66
+ <td align="center" rowspan="3"><a href="https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu">FineWeb-Edu</a></td>
67
+ <td align="center">850M</td>
68
+ <td align="center">100BT</td>
69
+ <td><a href="https://huggingface.co/MonetLLM/monet-hd-850M-100BT-hf">monet-hd-850M-100BT-hf</a></td>
70
+ <td></td>
71
+ </tr>
72
+ <tr>
73
+ <td align="center">1.4B</td>
74
+ <td align="center">100BT</td>
75
+ <td><a href="https://huggingface.co/MonetLLM/monet-hd-1.4B-100BT-hf">monet-hd-1.4B-100BT-hf</a></td>
76
+ <td></td>
77
+ </tr>
78
+ <tr>
79
+ <td align="center">4.1B</td>
80
+ <td align="center">100BT</td>
81
+ <td><a href="https://huggingface.co/MonetLLM/monet-hd-4.1B-100BT-hf">monet-hd-4.1B-100BT-hf</a></td>
82
+ <td></td>
83
+ </tr>
84
+ </table>
85
+
86
+ #### Instruction-Tuned Models
87
+
88
+ <table class="center">
89
+ <tr>
90
+ <td align="center"><b>Model</b></td>
91
+ <td align="center"><b>Purpose</b></td>
92
+ <td align="center"><b>Recipe</b></td>
93
+ <td align="center"><b>#Params</b></td>
94
+ <td align="center"><b>Checkpoint</b></td>
95
+ </tr>
96
+ <tr>
97
+ <td align="center" rowspan="2"><b>Monet-VD</b></td>
98
+ <td align="center">Chat Completion</td>
99
+ <td align="center"><a href="https://github.com/huggingface/alignment-handbook/tree/main/recipes/smollm">SmolLM</a></td>
100
+ <td align="center">1.4B</td>
101
+ <td><a href="https://huggingface.co/MonetLLM/monet-vd-1.4B-100BT-chat-hf">monet-vd-1.4B-100BT-chat-hf</a></td>
102
+ </tr>
103
+ <tr>
104
+ <td align="center">Vision-Language Model</td>
105
+ <td align="center"><a href="https://github.com/haotian-liu/LLaVA">LLaVA</a></td>
106
+ <td align="center">1.6B</td>
107
+ <td><a href="https://huggingface.co/MonetLLM/visionmonet-vd-1.4B-100BT-hf">visionmonet-vd-1.4B-100BT-hf</a></td>
108
+ </tr>
109
+ </table>
110
+
111
+ ## Evaluation
112
+
113
+ ### Open-Ended LLM Benchmarks
114
+ <table>
115
+ <thead>
116
+ <tr><th>Model</th><th>MMLU</th><th>ARC</th><th>WG</th><th>PIQA</th><th>SIQA</th><th>OBQA</th><th>HS</th><th>CSQA</th><th>Avg.</th></tr>
117
+ </thead>
118
+ <tbody>
119
+ <tr><td colspan="10" align="center"><b>0-shot</b></td></tr>
120
+ <tr><td align="center"><b>Monet-HD 850M</b></td><td align="center">0.320</td><td align="center">0.460</td><td align="center">0.506</td><td align="center">0.699</td><td align="center">0.416</td><td align="center">0.364</td><td align="center">0.465</td><td align="center">0.337</td><td align="center">0.446</td></tr>
121
+ <tr><td align="center"><b>Monet-VD 850M</b></td><td align="center">0.328</td><td align="center">0.456</td><td align="center">0.530</td><td align="center">0.708</td><td align="center">0.417</td><td align="center">0.356</td><td align="center">0.488</td><td align="center">0.343</td><td align="center">0.453</td></tr>
122
+ <tr><td align="center"><b>Monet-HD 1.4B</b></td><td align="center">0.338</td><td align="center">0.471</td><td align="center">0.538</td><td align="center">0.714</td><td align="center">0.418</td><td align="center">0.382</td><td align="center">0.501</td><td align="center">0.339</td><td align="center">0.463</td></tr>
123
+ <tr><td align="center"><b>Monet-VD 1.4B</b></td><td align="center">0.352</td><td align="center">0.495</td><td align="center">0.522</td><td align="center">0.727</td><td align="center">0.423</td><td align="center">0.418</td><td align="center">0.529</td><td align="center">0.363</td><td align="center">0.478</td></tr>
124
+ <tr><td align="center"><b>Monet-HD 4.1B</b></td><td align="center">0.375</td><td align="center">0.558</td><td align="center">0.560</td><td align="center">0.741</td><td align="center">0.427</td><td align="center">0.414</td><td align="center">0.571</td><td align="center">0.379</td><td align="center">0.503</td></tr>
125
+ <tr><td align="center"><b>Monet-VD 4.1B</b></td><td align="center">0.380</td><td align="center">0.547</td><td align="center">0.557</td><td align="center">0.751</td><td align="center">0.437</td><td align="center">0.424</td><td align="center">0.604</td><td align="center">0.389</td><td align="center">0.511</td></tr>
126
+ <tr><td colspan="10" align="center"><b>5-shot</b></td></tr>
127
+ <tr><td align="center"><b>Monet-HD 850M</b></td><td align="center">0.332</td><td align="center">0.537</td><td align="center">0.510</td><td align="center">0.697</td><td align="center">0.409</td><td align="center">0.346</td><td align="center">0.479</td><td align="center">0.420</td><td align="center">0.466</td></tr>
128
+ <tr><td align="center"><b>Monet-VD 850M</b></td><td align="center">0.341</td><td align="center">0.548</td><td align="center">0.520</td><td align="center">0.709</td><td align="center">0.437</td><td align="center">0.368</td><td align="center">0.504</td><td align="center">0.454</td><td align="center">0.485</td></tr>
129
+ <tr><td align="center"><b>Monet-HD 1.4B</b></td><td align="center">0.352</td><td align="center">0.544</td><td align="center">0.530</td><td align="center">0.720</td><td align="center">0.432</td><td align="center">0.360</td><td align="center">0.518</td><td align="center">0.441</td><td align="center">0.487</td></tr>
130
+ <tr><td align="center"><b>Monet-VD 1.4B</b></td><td align="center">0.360</td><td align="center">0.547</td><td align="center">0.526</td><td align="center">0.730</td><td align="center">0.441</td><td align="center">0.422</td><td align="center">0.551</td><td align="center">0.501</td><td align="center">0.510</td></tr>
131
+ <tr><td align="center"><b>Monet-HD 4.1B</b></td><td align="center">0.385</td><td align="center">0.603</td><td align="center">0.545</td><td align="center">0.742</td><td align="center">0.463</td><td align="center">0.412</td><td align="center">0.588</td><td align="center">0.545</td><td align="center">0.535</td></tr>
132
+ <tr><td align="center"><b>Monet-VD 4.1B</b></td><td align="center">0.398</td><td align="center">0.625</td><td align="center">0.564</td><td align="center">0.761</td><td align="center">0.470</td><td align="center">0.438</td><td align="center">0.619</td><td align="center">0.525</td><td align="center">0.550</td></tr>
133
+ </tbody>
134
+ </table>
135
+
136
+ ### Detoxification
137
+
138
+ Detoxification performance is evaluated on the [Monet-VD 1.4B](https://huggingface.co/MonetLLM/monet-vd-1.4B-100BT-hf) model.
139
+
140
+ #### RealToxicityPrompts
141
+
142
+ <table>
143
+ <thead>
144
+ <tr>
145
+ <th rowspan="2">Masking<br/>Threshold</th>
146
+ <th rowspan="2">Masking<br/>Ratio</th>
147
+ <th colspan="2">Exp. Max. Toxicity</th>
148
+ <th colspan="2">Toxicity Prob.</th>
149
+ <th rowspan="2">Avg. Perf.</th>
150
+ </tr>
151
+ <tr>
152
+ <th>Toxic</th>
153
+ <th>Non-Toxic</th>
154
+ <th>Toxic</th>
155
+ <th>Non-Toxic</th>
156
+ </tr>
157
+ </thead>
158
+ <tbody>
159
+ <tr>
160
+ <td align="center">–</td>
161
+ <td align="center">–</td>
162
+ <td align="center">0.795</td>
163
+ <td align="center">0.269</td>
164
+ <td align="center">0.926</td>
165
+ <td align="center">0.08</td>
166
+ <td align="center"><b>0.478</b></td>
167
+ </tr>
168
+ <tr>
169
+ <td align="center">0.2</td>
170
+ <td align="center">1.0%</td>
171
+ <td align="center">0.767</td>
172
+ <td align="center">0.268</td>
173
+ <td align="center">0.909</td>
174
+ <td align="center">0.07</td>
175
+ <td align="center"><b>0.479</b></td>
176
+ </tr>
177
+ <tr>
178
+ <td align="center">0.1</td>
179
+ <td align="center">4.1%</td>
180
+ <td align="center">0.657</td>
181
+ <td align="center">0.270</td>
182
+ <td align="center">0.768</td>
183
+ <td align="center">0.08</td>
184
+ <td align="center"><b>0.478</b></td>
185
+ </tr>
186
+ <tr>
187
+ <td align="center">0.05</td>
188
+ <td align="center">14.4%</td>
189
+ <td align="center"><b>0.552</b></td>
190
+ <td align="center"><b>0.256</b></td>
191
+ <td align="center"><b>0.564</b></td>
192
+ <td align="center"><b>0.05</b></td>
193
+ <td align="center">0.467</td>
194
+ </tr>
195
+ </tbody>
196
+ </table>
197
+
198
+ #### ToxiGen
199
+ <table>
200
+ <thead>
201
+ <tr>
202
+ <th rowspan="2">Masking<br/>Threshold</th>
203
+ <th rowspan="2">Masking<br/>Ratio</th>
204
+ <th colspan="2">RoBERTa Score</th>
205
+ <th rowspan="2">Avg. Perf.</th>
206
+ </tr>
207
+ <tr>
208
+ <th>Hate</th>
209
+ <th>Neutral</th>
210
+ </tr>
211
+ </thead>
212
+ <tbody>
213
+ <tr>
214
+ <td align="center">–</td>
215
+ <td align="center">–</td>
216
+ <td align="center">0.642</td>
217
+ <td align="center">0.035</td>
218
+ <td align="center"><b>0.478</b></td>
219
+ </tr>
220
+ <tr>
221
+ <td align="center">0.2</td>
222
+ <td align="center">1.4%</td>
223
+ <td align="center">0.643</td>
224
+ <td align="center">0.033</td>
225
+ <td align="center"><b>0.478</b></td>
226
+ </tr>
227
+ <tr>
228
+ <td align="center">0.1</td>
229
+ <td align="center">5.4%</td>
230
+ <td align="center">0.504</td>
231
+ <td align="center">0.028</td>
232
+ <td align="center">0.473</td>
233
+ </tr>
234
+ <tr>
235
+ <td align="center">0.05</td>
236
+ <td align="center">15.0%</td>
237
+ <td align="center"><b>0.430</b></td>
238
+ <td align="center"><b>0.027</b></td>
239
+ <td align="center">0.455</td>
240
+ </tr>
241
+ </tbody>
242
+ </table>
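
The relation between the masking threshold and the masking ratio in the tables above can be sketched as follows. This is a minimal illustration, assuming each expert has a scalar toxicity-association score and that experts scoring above the threshold are masked. The score values, the function names, and the scoring rule itself are hypothetical, since this README does not specify them.

```python
def select_masked_experts(scores, threshold):
    """Return indices of experts whose (hypothetical) score exceeds the threshold."""
    return [i for i, score in enumerate(scores) if score > threshold]


def masking_ratio(scores, threshold):
    """Fraction of all experts that would be masked at the given threshold."""
    return len(select_masked_experts(scores, threshold)) / len(scores)


# Made-up per-expert scores, for illustration only (not measured values).
scores = [0.01, 0.30, 0.07, 0.15, 0.02, 0.04, 0.22, 0.03]

# The same thresholds as in the tables: a lower threshold masks more experts.
for threshold in (0.2, 0.1, 0.05):
    masked = select_masked_experts(scores, threshold)
    print(f"threshold={threshold}: masked={masked}, ratio={masking_ratio(scores, threshold):.1%}")
```

In this toy setup, lowering the threshold can only add experts to the masked set, which matches the trend in the tables: the masking ratio rises as the threshold falls.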
243
+
244
+
245
+ ## Examples
246
+
247
+ ### Text Generation
248
+
249
+ ```python
250
+ import torch
251
+ from transformers import AutoTokenizer, pipeline
252
+ model_name = "MonetLLM/monet-vd-1.4B-100BT-hf"
253
+ pipe = pipeline(
254
+     "text-generation",
255
+     model_name,
256
+     tokenizer=AutoTokenizer.from_pretrained(model_name),
257
+     torch_dtype=torch.bfloat16,
258
+     device_map="auto",
259
+     trust_remote_code=True,  # Monet uses custom modeling code from the Hub
260
+ )
261
+ print(pipe("The key to life is", max_new_tokens=20, do_sample=True)[0]["generated_text"])
262
+ ```
263
+
264
+ ### Code Generation
265
+
266
+ ```python
267
+ import torch
268
+ from transformers import AutoTokenizer, pipeline
269
+ model_name = "MonetLLM/codemonet-vd-1.4B-100BT-hf"
270
+ pipe = pipeline(
271
+     "text-generation",
272
+     model_name,
273
+     tokenizer=AutoTokenizer.from_pretrained(model_name),
274
+     torch_dtype=torch.bfloat16,
275
+     device_map="auto",
276
+     trust_remote_code=True,
277
+ )
278
+
279
+ text = '''
280
+ def print_len(x: str):
281
+     """For a given string x, print the length of x."""
282
+ '''
283
+ print(pipe(text, max_new_tokens=10)[0]["generated_text"].split("\n\n")[0])
284
+ ```
285
+
286
+ ### Chat Completion
287
+
288
+ ```python
289
+ import torch
290
+ from transformers import AutoTokenizer, pipeline
291
+ model_name = "MonetLLM/monet-vd-1.4B-100BT-chat-hf"
292
+ pipe = pipeline(
293
+     "text-generation",
294
+     model_name,
295
+     tokenizer=AutoTokenizer.from_pretrained(model_name),
296
+     torch_dtype=torch.bfloat16,
297
+     device_map="auto",
298
+     trust_remote_code=True,
299
+ )
300
+
301
+ text = pipe.tokenizer.apply_chat_template(
302
+     [{"role": "user", "content": "Hi! How are you?"}],
303
+     add_generation_prompt=True,
304
+     tokenize=False,
305
+ )
306
+ print(pipe(text, max_new_tokens=30, do_sample=True)[0]["generated_text"])
307
+ ```
308
+
309
+ ### Using vLLM
310
+
311
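+ A custom vLLM implementation of Monet is provided in [the repository](https://github.com/dmis-lab/Monet/blob/main/modeling_monet_vllm.py). The example below imports it as `modeling_monet_vllm`, so the file must be on your Python import path.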
312
+
313
+ ```python
314
+ from vllm import LLM, ModelRegistry, SamplingParams
315
+ from modeling_monet_vllm import MonetForCausalLM
316
+
317
+ # Register Monet architecture with vLLM
318
+ ModelRegistry.register_model("MonetForCausalLM", MonetForCausalLM)
319
+
320
+ model = LLM(
321
+     "MonetLLM/monet-vd-1.4B-100BT-hf",
322
+     trust_remote_code=True,
323
+     dtype="bfloat16",
324
+     gpu_memory_utilization=0.8,
325
+ )
326
+ sampling_params = SamplingParams(max_tokens=20, temperature=1.0)
327
+ print(model.generate("The key to life is", sampling_params)[0].outputs[0].text)
328
+ ```
329
+
330
+ ## Training
331
+ ### Model
332
+ - Architecture: Monet
333
+ - Pretraining tokens: 100B
334
+ - Precision: bfloat16
335
+ ### Hardware
336
+ - TPUs: TPU-v4-64 Pod Slice (supported by [TRC Program](https://sites.research.google/trc/about/))
337
+ ### Software
338
+ - Training Framework: [JAX](https://github.com/jax-ml/jax), [Flax](https://github.com/google/flax)
339
+
340
+ ## Intended Use
341
+
342
+ ### Primary Intended Uses
343
+ This model is designed to advance research on language models and serve as a foundational component for generative AI-driven functionalities. Its primary applications, mostly in English, include:
344
+
345
+ - Mechanistic interpretability research for language models
346
+ - Text generation with enhanced interpretability
347
+ - Code generation (CodeMonet variant)
348
+ - Chat completion (instruction-tuned variant)
349
+ - Vision-language tasks (VisionMonet variant)
350
+
351
+ ### Out-of-Scope Uses
352
+ This model has not been explicitly developed or tested for all potential downstream applications. Therefore:
353
+
354
+ 1. Limitations & Mitigations: Developers should be mindful of common language model limitations, and thoroughly evaluate and mitigate risks regarding accuracy, safety, and fairness—especially in high-stakes or high-risk scenarios.
355
+ 2. Legal & Regulatory Compliance: Developers must comply with any applicable laws and regulations (e.g., privacy, trade compliance), taking into account the model’s English-focused training (refer to <a href="https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu">FineWeb-Edu</a>).
356
+ 3. No License Modification: Nothing in this Model Card modifies or restricts the license under which this model is released.
357
+ 4. Unsupported Programming Languages: For the CodeMonet variant, programming in languages not covered by <a href="https://huggingface.co/datasets/bigcode/starcoderdata">StarCoderData</a> is outside the model’s intended scope.
358
+
359
+ ## Model Architecture
360
+
361
+ Monet introduces a novel Mixture-of-Experts (MoE) architecture with several key innovations:
362
+
363
+ - Parameter-efficient expert decomposition: overall parameter count grows in proportion to the square root of the number of experts
364
+ - Fine-grained expert specialization: offers clear insight into model behavior
365
+ - Precise manipulation of knowledge: enables control over domain knowledge, programming language capabilities, and toxicity level
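
The square-root scaling in the first bullet can be made concrete with a small arithmetic sketch. It assumes, for illustration only, that each expert is addressed by a pair of shared sub-components, so that N experts need about 2·sqrt(N) stored components. The function names and the factor of two are assumptions, not details taken from this README.

```python
import math


def naive_components(num_experts: int) -> int:
    """One stored component per expert: grows linearly with the expert count."""
    return num_experts


def decomposed_components(num_experts: int) -> int:
    """Hypothetical pairwise decomposition: two groups of sqrt(N) shared parts."""
    side = math.isqrt(num_experts)
    if side * side != num_experts:
        raise ValueError("this sketch assumes a perfect-square expert count")
    return 2 * side


num_experts = 262_144  # expert count quoted in the Model Summary
print(naive_components(num_experts))       # 262144
print(decomposed_components(num_experts))  # 1024
```

In this toy model, quadrupling the expert count only doubles the number of stored components, which is the sense in which growth is proportional to the square root of the number of experts.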
366
+
367
+ ## Ethical Considerations
368
+
369
+ ### Transparency
370
+ - Designed specifically for enhanced interpretability
371
+ - Enables understanding of internal model behavior
372
+ - Allows tracking of knowledge attribution
373
+
374
+ ### Control
375
+ - Supports toxicity mitigation
376
+ - Enables domain-specific knowledge control
377
+ - Maintains performance while adjusting behavior
378
+
379
+ ## License and Usage
380
+ Monet is licensed under the Apache 2.0 license. The model is primarily intended for research and educational use. Important licensing notes:
381
+
382
+ - Instruction-tuned models have been fine-tuned on a dataset mix that includes outputs generated by third-party models
383
+ - Research and educational use is encouraged
384
+ - Commercial use is subject to Apache 2.0 license terms
385
+
386
+ ## Citation
387
+ ```bibtex
388
+ @article{park2024monet,
389
+ title={{Monet: Mixture of Monosemantic Experts for Transformers}},
390
+ author={Jungwoo Park and Young Jin Ahn and Kee-Eung Kim and Jaewoo Kang},
391
+ journal={arXiv preprint arXiv:2412.04139},
392
+ year={2024}
393
+ }
394
+ ```