add note to use new model
README.md
CHANGED
@@ -7,6 +7,9 @@ language:
 pipeline_tag: fill-mask
 ---
 
+**NOTE: THIS MODEL IS NOT INTEGRATED WITH HUGGING FACE**. Please use the version of this model converted to the newly implemented `Mega`
+architecture in `transformers` ([link](https://huggingface.co/mnaylor/mega-base-wikitext))
+
 # Moving Average Gated Attention (Mega): Pretrained LM
 
 This repo contains pretrained weights for a language model with the Mega architecture (see [paper](https://arxiv.org/abs/2209.10655)).