mnaylor/mega-wikitext-103

mnaylor committed on Feb 23, 2023

Commit 58183a3 · Parent: fd598ef

add note to use new model

Files changed (1)

README.md +3 -0

@@ -7,6 +7,9 @@ language:
 pipeline_tag: fill-mask
 ---
+**NOTE: THIS MODEL IS NOT INTEGRATED WITH HUGGING FACE**. Please use the version of this model converted to the newly implemented `Mega`
+architecture in `transformers` ([link](https://huggingface.co/mnaylor/mega-base-wikitext))
 # Moving Average Gated Attention (Mega): Pretrained LM
 This repo contains pretrained weights for a language model with the Mega architecture (see [paper](https://arxiv.org/abs/2209.10655)).