mnaylor committed on
Commit
58183a3
·
1 Parent(s): fd598ef

add note to use new model

Files changed (1)
  1. README.md +3 -0
README.md CHANGED
@@ -7,6 +7,9 @@ language:
 pipeline_tag: fill-mask
 ---
 
+**NOTE: THIS MODEL IS NOT INTEGRATED WITH HUGGING FACE**. Please use the version of this model converted to the newly implemented `Mega`
+architecture in `transformers` ([link](https://huggingface.co/mnaylor/mega-base-wikitext))
+
 # Moving Average Gated Attention (Mega): Pretrained LM
 
 This repo contains pretrained weights for a language model with the Mega architecture (see [paper](https://arxiv.org/abs/2209.10655)).