|
--- |
|
license: apache-2.0 |
|
language: |
|
- en |
|
pipeline_tag: text-generation |
|
library_name: transformers |
|
--- |
|
|
|
# Monet: Mixture of Monosemantic Experts for Transformers |
|
|
|
## Model Summary |
|
|
|
Monet introduces a novel approach for improving mechanistic interpretability in large language models (LLMs) using a Sparse Mixture-of-Experts (SMoE) architecture with 262,144 experts. By integrating sparse dictionary learning directly into end-to-end pretraining, Monet tackles the core issue of polysemanticity—where single neurons encode multiple unrelated concepts—while preserving overall model performance. |
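
The released checkpoints rely on custom modeling code, so they load through the standard `transformers` API with `trust_remote_code=True` (full generation examples appear below). The following is a minimal sketch for loading a checkpoint and inspecting its configuration; the expert-related attribute names are defined by the model's custom config class rather than by this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "MonetLLM/monet-vd-1.4B-100BT-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # fetches the custom Monet modeling code
)

# Print the full config to see the expert-related hyperparameters;
# the attribute names come from the checkpoint's custom config class.
print(model.config)
```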
|
|
|
|
|
### Resources and Technical Documentation |
|
|
|
- **GitHub Repository**: https://github.com/dmis-lab/Monet |
|
- **Paper**: https://arxiv.org/abs/2412.04139 |
|
- **Model Hub**: https://huggingface.co/MonetLLM |
|
- **Demo**: https://huggingface.co/spaces/MonetLLM/monet-vd-1.4B-100BT-hf-viewer |
|
|
|
### Available Checkpoints |
|
|
|
#### Base Models |
|
|
|
|
|
<table class="center"> |
|
<tr> |
|
<td align="center"><b>Model</b></td> |
|
<td align="center"><b>Dataset</b></td> |
|
<td align="center"><b>#Params</b></td> |
|
<td align="center"><b>#Tokens</b></td> |
|
<td align="center"><b>Checkpoint</b></td> |
|
<td align="center"><b>Demo</b></td> |
|
</tr> |
|
<tr> |
|
<td align="center" rowspan="4"><b>Monet-VD</b></td> |
|
<td align="center" rowspan="3"><a href="https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu">FineWeb-Edu</a></td> |
|
<td align="center">850M</td> |
|
<td align="center">100BT</td> |
|
<td><a href="https://huggingface.co/MonetLLM/monet-vd-850M-100BT-hf">monet-vd-850M-100BT-hf</a></td> |
|
<td></td> |
|
</tr> |
|
<tr> |
|
<td align="center">1.4B</td> |
|
<td align="center">100BT</td> |
|
<td><a href="https://huggingface.co/MonetLLM/monet-vd-1.4B-100BT-hf">monet-vd-1.4B-100BT-hf</a></td> |
|
<td><a href="https://huggingface.co/spaces/MonetLLM/monet-vd-1.4B-100BT-hf-viewer">Viewer</a></td> |
|
</tr> |
|
<tr> |
|
<td align="center">4.1B</td> |
|
<td align="center">100BT</td> |
|
<td><a href="https://huggingface.co/MonetLLM/monet-vd-4.1B-100BT-hf">monet-vd-4.1B-100BT-hf</a></td> |
|
<td></td> |
|
</tr> |
|
<tr> |
|
<td align="center"><a href="https://huggingface.co/datasets/bigcode/starcoderdata">StarCoderData</a></td> |
|
<td align="center">1.4B</td> |
|
<td align="center">100BT</td> |
|
<td><a href="https://huggingface.co/MonetLLM/codemonet-vd-1.4B-100BT-hf">codemonet-vd-1.4B-100BT-hf</a></td> |
|
<td><a href="https://huggingface.co/spaces/MonetLLM/codemonet-vd-1.4B-100BT-hf-viewer">Viewer</a></td> |
|
</tr> |
|
<tr> |
|
<td align="center" rowspan="3"><b>Monet-HD</b></td> |
|
<td align="center" rowspan="3"><a href="https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu">FineWeb-Edu</a></td> |
|
<td align="center">850M</td> |
|
<td align="center">100BT</td> |
|
<td><a href="https://huggingface.co/MonetLLM/monet-hd-850M-100BT-hf">monet-hd-850M-100BT-hf</a></td> |
|
<td></td> |
|
</tr> |
|
<tr> |
|
<td align="center">1.4B</td> |
|
<td align="center">100BT</td> |
|
<td><a href="https://huggingface.co/MonetLLM/monet-hd-1.4B-100BT-hf">monet-hd-1.4B-100BT-hf</a></td> |
|
<td></td> |
|
</tr> |
|
<tr> |
|
<td align="center">4.1B</td> |
|
<td align="center">100BT</td> |
|
<td><a href="https://huggingface.co/MonetLLM/monet-hd-4.1B-100BT-hf">monet-hd-4.1B-100BT-hf</a></td> |
|
<td></td> |
|
</tr> |
|
</table> |
|
|
|
#### Instruction-Tuned Models |
|
|
|
<table class="center"> |
|
<tr> |
|
<td align="center"><b>Model</b></td> |
|
<td align="center"><b>Purpose</b></td> |
|
<td align="center"><b>Recipe</b></td> |
|
<td align="center"><b>#Params</b></td> |
|
<td align="center"><b>Checkpoint</b></td> |
|
</tr> |
|
<tr> |
|
<td align="center" rowspan="2"><b>Monet-VD</b></td> |
|
<td align="center">Chat Completion</td> |
|
<td align="center"><a href="https://github.com/huggingface/alignment-handbook/tree/main/recipes/smollm">SmolLM</a></td> |
|
<td align="center">1.4B</td> |
|
<td><a href="https://huggingface.co/MonetLLM/monet-vd-1.4B-100BT-chat-hf">monet-vd-1.4B-100BT-chat-hf</a></td> |
|
</tr> |
|
<tr> |
|
<td align="center">Vision-Language Model</td> |
|
<td align="center"><a href="https://github.com/haotian-liu/LLaVA">LLaVA</a></td> |
|
<td align="center">1.6B</td> |
|
<td><a href="https://huggingface.co/MonetLLM/visionmonet-vd-1.4B-100BT-hf">visionmonet-vd-1.4B-100BT-hf</a></td> |
|
</tr> |
|
</table> |
|
|
|
## Evaluation |
|
|
|
### Open-Ended LLM Benchmarks

Abbreviations: WG = WinoGrande, SIQA = Social IQa, OBQA = OpenBookQA, HS = HellaSwag, CSQA = CommonsenseQA.
|
<table> |
|
<thead> |
|
<tr><th>Model</th><th>MMLU</th><th>ARC</th><th>WG</th><th>PIQA</th><th>SIQA</th><th>OBQA</th><th>HS</th><th>CSQA</th><th>Avg.</th></tr>
|
</thead> |
|
<tbody> |
|
<tr><td colspan="10" align="center"><b>0-shot</b></td></tr> |
|
<tr><td align="center"><b>Monet-HD 850M</b></td><td align="center">0.320</td><td align="center">0.460</td><td align="center">0.506</td><td align="center">0.699</td><td align="center">0.416</td><td align="center">0.364</td><td align="center">0.465</td><td align="center">0.337</td><td align="center">0.446</td></tr> |
|
<tr><td align="center"><b>Monet-VD 850M</b></td><td align="center">0.328</td><td align="center">0.456</td><td align="center">0.530</td><td align="center">0.708</td><td align="center">0.417</td><td align="center">0.356</td><td align="center">0.488</td><td align="center">0.343</td><td align="center">0.453</td></tr> |
|
<tr><td align="center"><b>Monet-HD 1.4B</b></td><td align="center">0.338</td><td align="center">0.471</td><td align="center">0.538</td><td align="center">0.714</td><td align="center">0.418</td><td align="center">0.382</td><td align="center">0.501</td><td align="center">0.339</td><td align="center">0.463</td></tr> |
|
<tr><td align="center"><b>Monet-VD 1.4B</b></td><td align="center">0.352</td><td align="center">0.495</td><td align="center">0.522</td><td align="center">0.727</td><td align="center">0.423</td><td align="center">0.418</td><td align="center">0.529</td><td align="center">0.363</td><td align="center">0.478</td></tr> |
|
<tr><td align="center"><b>Monet-HD 4.1B</b></td><td align="center">0.375</td><td align="center">0.558</td><td align="center">0.560</td><td align="center">0.741</td><td align="center">0.427</td><td align="center">0.414</td><td align="center">0.571</td><td align="center">0.379</td><td align="center">0.503</td></tr> |
|
<tr><td align="center"><b>Monet-VD 4.1B</b></td><td align="center">0.380</td><td align="center">0.547</td><td align="center">0.557</td><td align="center">0.751</td><td align="center">0.437</td><td align="center">0.424</td><td align="center">0.604</td><td align="center">0.389</td><td align="center">0.511</td></tr> |
|
<tr><td colspan="10" align="center"><b>5-shot</b></td></tr> |
|
<tr><td align="center"><b>Monet-HD 850M</b></td><td align="center">0.332</td><td align="center">0.537</td><td align="center">0.510</td><td align="center">0.697</td><td align="center">0.409</td><td align="center">0.346</td><td align="center">0.479</td><td align="center">0.420</td><td align="center">0.466</td></tr> |
|
<tr><td align="center"><b>Monet-VD 850M</b></td><td align="center">0.341</td><td align="center">0.548</td><td align="center">0.520</td><td align="center">0.709</td><td align="center">0.437</td><td align="center">0.368</td><td align="center">0.504</td><td align="center">0.454</td><td align="center">0.485</td></tr> |
|
<tr><td align="center"><b>Monet-HD 1.4B</b></td><td align="center">0.352</td><td align="center">0.544</td><td align="center">0.530</td><td align="center">0.720</td><td align="center">0.432</td><td align="center">0.360</td><td align="center">0.518</td><td align="center">0.441</td><td align="center">0.487</td></tr> |
|
<tr><td align="center"><b>Monet-VD 1.4B</b></td><td align="center">0.360</td><td align="center">0.547</td><td align="center">0.526</td><td align="center">0.730</td><td align="center">0.441</td><td align="center">0.422</td><td align="center">0.551</td><td align="center">0.501</td><td align="center">0.510</td></tr> |
|
<tr><td align="center"><b>Monet-HD 4.1B</b></td><td align="center">0.385</td><td align="center">0.603</td><td align="center">0.545</td><td align="center">0.742</td><td align="center">0.463</td><td align="center">0.412</td><td align="center">0.588</td><td align="center">0.545</td><td align="center">0.535</td></tr> |
|
<tr><td align="center"><b>Monet-VD 4.1B</b></td><td align="center">0.398</td><td align="center">0.625</td><td align="center">0.564</td><td align="center">0.761</td><td align="center">0.470</td><td align="center">0.438</td><td align="center">0.619</td><td align="center">0.525</td><td align="center">0.550</td></tr> |
|
</tbody> |
|
</table> |
|
|
|
### Detoxification |
|
|
|
Detoxification performance is evaluated on the [Monet-VD 1.4B](https://huggingface.co/MonetLLM/monet-vd-1.4B-100BT-hf) model.
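
The masking ratio in the tables below follows from the masking threshold: an expert whose estimated toxicity score exceeds the threshold is disabled, and the ratio is the fraction of experts removed. The sketch below is a hedged illustration of that relationship using synthetic scores; the actual per-expert scores are computed as described in the paper, not by this code:

```python
import numpy as np

# Synthetic per-expert toxicity scores in [0, 1]; the real scores are
# estimated from expert activations as described in the Monet paper.
rng = np.random.default_rng(0)
toxicity_scores = rng.beta(0.5, 8.0, size=262_144)

for threshold in (0.2, 0.1, 0.05):
    mask = toxicity_scores >= threshold  # experts to disable
    print(f"threshold={threshold}: masking ratio = {mask.mean():.1%}")
```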
|
|
|
#### RealToxicityPrompts |
|
|
|
<table> |
|
<thead> |
|
<tr> |
|
<th rowspan="2">Masking<br/>Threshold</th> |
|
<th rowspan="2">Masking<br/>Ratio</th> |
|
<th colspan="2">Exp. Max. Toxicity</th> |
|
<th colspan="2">Toxicity Prob.</th> |
|
<th rowspan="2">Avg. Perf.</th> |
|
</tr> |
|
<tr> |
|
<th>Toxic</th> |
|
<th>Non-Toxic</th> |
|
<th>Toxic</th> |
|
<th>Non-Toxic</th> |
|
</tr> |
|
</thead> |
|
<tbody> |
|
<tr> |
|
<td align="center">–</td> |
|
<td align="center">–</td> |
|
<td align="center">0.795</td> |
|
<td align="center">0.269</td> |
|
<td align="center">0.926</td> |
|
<td align="center">0.08</td> |
|
<td align="center"><b>0.478</b></td> |
|
</tr> |
|
<tr> |
|
<td align="center">0.2</td> |
|
<td align="center">1.0%</td> |
|
<td align="center">0.767</td> |
|
<td align="center">0.268</td> |
|
<td align="center">0.909</td> |
|
<td align="center">0.07</td> |
|
<td align="center"><b>0.479</b></td> |
|
</tr> |
|
<tr> |
|
<td align="center">0.1</td> |
|
<td align="center">4.1%</td> |
|
<td align="center">0.657</td> |
|
<td align="center">0.270</td> |
|
<td align="center">0.768</td> |
|
<td align="center">0.08</td> |
|
<td align="center"><b>0.478</b></td> |
|
</tr> |
|
<tr> |
|
<td align="center">0.05</td> |
|
<td align="center">14.4%</td> |
|
<td align="center"><b>0.552</b></td> |
|
<td align="center"><b>0.256</b></td> |
|
<td align="center"><b>0.564</b></td> |
|
<td align="center"><b>0.05</b></td> |
|
<td align="center">0.467</td> |
|
</tr> |
|
</tbody> |
|
</table> |
|
|
|
#### ToxiGen |
|
<table> |
|
<thead> |
|
<tr> |
|
<th rowspan="2">Masking<br/>Threshold</th> |
|
<th rowspan="2">Masking<br/>Ratio</th> |
|
<th colspan="2">RoBERTa Score</th> |
|
<th rowspan="2">Avg. Perf.</th> |
|
</tr> |
|
<tr> |
|
<th>Hate</th> |
|
<th>Neutral</th> |
|
</tr> |
|
</thead> |
|
<tbody> |
|
<tr> |
|
<td align="center">–</td> |
|
<td align="center">–</td> |
|
<td align="center">0.642</td> |
|
<td align="center">0.035</td> |
|
<td align="center"><b>0.478</b></td> |
|
</tr> |
|
<tr> |
|
<td align="center">0.2</td> |
|
<td align="center">1.4%</td> |
|
<td align="center">0.643</td> |
|
<td align="center">0.033</td> |
|
<td align="center"><b>0.478</b></td> |
|
</tr> |
|
<tr> |
|
<td align="center">0.1</td> |
|
<td align="center">5.4%</td> |
|
<td align="center">0.504</td> |
|
<td align="center">0.028</td> |
|
<td align="center">0.473</td> |
|
</tr> |
|
<tr> |
|
<td align="center">0.05</td> |
|
<td align="center">15.0%</td> |
|
<td align="center"><b>0.430</b></td> |
|
<td align="center"><b>0.027</b></td> |
|
<td align="center">0.455</td> |
|
</tr> |
|
</tbody> |
|
</table> |
|
|
|
|
|
## Examples |
|
|
|
### Text Generation |
|
|
|
```python
import torch
from transformers import AutoTokenizer, pipeline

model_name = "MonetLLM/monet-vd-1.4B-100BT-hf"
pipe = pipeline(
    "text-generation",
    model_name,
    tokenizer=AutoTokenizer.from_pretrained(model_name),
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
print(pipe("The key to life is", max_new_tokens=20, do_sample=True)[0]["generated_text"])
```
|
|
|
### Code Generation |
|
|
|
```python
import torch
from transformers import AutoTokenizer, pipeline

model_name = "MonetLLM/codemonet-vd-1.4B-100BT-hf"
pipe = pipeline(
    "text-generation",
    model_name,
    tokenizer=AutoTokenizer.from_pretrained(model_name),
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

text = '''
def print_len(x: str):
    """For a given string x, print the length of x."""
'''
print(pipe(text, max_new_tokens=10)[0]["generated_text"].split("\n\n")[0])
```
|
|
|
### Chat Completion |
|
|
|
```python
import torch
from transformers import AutoTokenizer, pipeline

model_name = "MonetLLM/monet-vd-1.4B-100BT-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
pipe = pipeline(
    "text-generation",
    model_name,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

text = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hi! How are you?"}],
    add_generation_prompt=True,
    tokenize=False,
)
print(pipe(text, max_new_tokens=30, do_sample=True)[0]["generated_text"])
```
|
|
|
### Using vLLM |
|
|
|
A custom vLLM implementation of Monet is provided in [the repository](https://github.com/dmis-lab/Monet/blob/main/modeling_monet_vllm.py).
|
|
|
```python
from vllm import LLM, ModelRegistry, SamplingParams
from modeling_monet_vllm import MonetForCausalLM

# Register Monet architecture with vLLM
ModelRegistry.register_model("MonetForCausalLM", MonetForCausalLM)

model = LLM(
    "MonetLLM/monet-vd-1.4B-100BT-hf",
    trust_remote_code=True,
    dtype="bfloat16",
    gpu_memory_utilization=0.8,
)
sampling_params = SamplingParams(max_tokens=20, temperature=1.0)
print(model.generate("The key to life is", sampling_params)[0].outputs[0].text)
```
|
|
|
## Training |
|
### Model |
|
- Architecture: Monet |
|
- Pretraining tokens: 100B |
|
- Precision: bfloat16 |
|
### Hardware |
|
- TPUs: TPU-v4-64 Pod Slice (supported by [TRC Program](https://sites.research.google/trc/about/)) |
|
### Software |
|
- Training Framework: [JAX](https://github.com/jax-ml/jax), [Flax](https://github.com/google/flax)
|
|
|
## Intended Use |
|
|
|
### Primary Intended Uses |
|
This model is designed to advance research on language models and serve as a foundational component for generative AI-driven functionalities. Its primary applications, mostly in English, include: |
|
|
|
- Mechanistic interpretability research for language models |
|
- Text generation with enhanced interpretability |
|
- Code generation (CodeMonet variant) |
|
- Chat completion (instruction-tuned variant) |
|
- Vision-language tasks (VisionMonet variant) |
|
|
|
### Out-of-Scope Uses |
|
This model has not been explicitly developed or tested for all potential downstream applications. Therefore: |
|
|
|
1. Limitations & Mitigations: Developers should be mindful of common language model limitations, and thoroughly evaluate and mitigate risks regarding accuracy, safety, and fairness—especially in high-stakes or high-risk scenarios. |
|
2. Legal & Regulatory Compliance: Developers must comply with any applicable laws and regulations (e.g., privacy, trade compliance), taking into account the model’s English-focused training (refer to <a href="https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu">FineWeb-Edu</a>). |
|
3. No License Modification: Nothing in this Model Card modifies or restricts the license under which this model is released. |
|
4. Unsupported Programming Languages: Code generation in programming languages not covered by <a href="https://huggingface.co/datasets/bigcode/starcoderdata">StarCoderData</a> (CodeMonet variant) is outside the model’s intended scope.
|
|
|
## Model Architecture |
|
|
|
Monet introduces a novel Mixture-of-Experts (MoE) architecture with several key innovations: |
|
|
|
- Parameter-efficient expert decomposition: overall parameter count grows in proportion to the square root of the number of experts (a worked example follows this list)
|
- Fine-grained expert specialization: offers clear insight into model behavior |
|
- Precise manipulation of knowledge: enables control over domain knowledge, programming language capabilities, and toxicity level. |
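
For intuition on the square-root scaling, here is a hedged back-of-the-envelope comparison with purely illustrative layer sizes (not the actual Monet configuration): a naive MoE layer stores separate parameters for every expert, while a Monet-style decomposition stores two groups of roughly √N partial experts that are composed into the N experts.

```python
import math

n_experts = 262_144            # experts per layer, from the model summary
d_model, d_expert = 2048, 16   # illustrative sizes only, not the real config

# Naive MoE: every expert keeps its own up- and down-projections.
naive_params = n_experts * 2 * d_model * d_expert

# Monet-style decomposition (sketch): two groups of sqrt(N) partial experts
# are composed into N experts, so storage scales with sqrt(N) instead of N.
sqrt_n = math.isqrt(n_experts)                    # 512
decomposed_params = 2 * sqrt_n * 2 * d_model * d_expert

print(sqrt_n, naive_params // decomposed_params)  # 512 256  (ratio = sqrt(N)/2)
```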
|
|
|
## Ethical Considerations |
|
|
|
### Transparency |
|
- Designed specifically for enhanced interpretability |
|
- Enables understanding of internal model behavior |
|
- Allows tracking of knowledge attribution |
|
|
|
### Control |
|
- Supports toxicity mitigation |
|
- Enables domain-specific knowledge control |
|
- Maintains performance while adjusting behavior |
|
|
|
## License and Usage |
|
Monet is licensed under the Apache 2.0 license. The model is primarily intended for research and educational use. Important licensing notes: |
|
|
|
- Instruction-tuned models have been fine-tuned on a dataset mix that includes outputs generated by third-party models
|
- Research and educational use is encouraged |
|
- Commercial use is subject to Apache 2.0 license terms |
|
|
|
## Citation |
|
```bibtex |
|
@article{park2024monet, |
|
title={{Monet: Mixture of Monosemantic Experts for Transformers}}, |
|
author={Jungwoo Park and Young Jin Ahn and Kee-Eung Kim and Jaewoo Kang}, |
|
journal={arXiv preprint arXiv:2412.04139},
|
year={2024} |
|
} |
|
``` |