Dan Fu committed
Commit a512f44
1 Parent(s): 4f7610c
8k retrieval
README.md
CHANGED
@@ -8,7 +8,7 @@ pipeline_tag: text-classification
 # Monarch Mixer-BERT
 
 The 80M checkpoint for M2-BERT-base from the paper [Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture](https://arxiv.org/abs/2310.12109).
 
-This model has been pretrained with sequence length
+This model has been pretrained with sequence length 8192, and it has been fine-tuned for retrieval.
 
 This model was trained by Dan Fu, Jon Saad-Falcon, and Simran Arora.
 
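
For context on the updated description (8192-token pretraining, retrieval fine-tuning), here is a minimal sketch of loading such a checkpoint and producing an embedding with Hugging Face `transformers`. The repo id `togethercomputer/m2-bert-80M-8k-retrieval`, the `trust_remote_code=True` flag, and the `sentence_embedding` output key are assumptions not stated in this diff.

```python
# Minimal sketch of using the retrieval-fine-tuned checkpoint described above.
# Assumptions (not stated in the diff): repo id, trust_remote_code=True,
# and that the model returns a "sentence_embedding" tensor.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MAX_SEQ_LENGTH = 8192  # matches the pretraining sequence length in the diff

# Hypothetical repo id for the 80M, 8k-retrieval checkpoint.
model = AutoModelForSequenceClassification.from_pretrained(
    "togethercomputer/m2-bert-80M-8k-retrieval",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "bert-base-uncased",
    model_max_length=MAX_SEQ_LENGTH,
)

inputs = tokenizer(
    ["Monarch Mixer is a sub-quadratic GEMM-based architecture."],
    return_tensors="pt",
    padding="max_length",
    truncation=True,
    max_length=MAX_SEQ_LENGTH,
    return_token_type_ids=False,
)
outputs = model(**inputs)
embedding = outputs["sentence_embedding"]  # assumed output key; dense vector for retrieval
print(embedding.shape)
```

Queries and documents would be embedded the same way, with similarity (e.g. cosine) between the resulting vectors used for retrieval.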