BerenMillidge
committed on
Update README.md
README.md
CHANGED
@@ -3,7 +3,7 @@ license: apache-2.0
 ---
 # Model Card for Zamba v2 2.7B
 
-
+Zamba2-2.7B is a hybrid model that combines state-space models and transformers. It broadly follows the [Zamba architecture](https://arxiv.org/abs/2405.16712), which consists of a Mamba backbone alternating with shared transformer blocks. Zamba2-2.7B has three major improvements over Zamba1:
 
 1.) Mamba1 blocks have been replaced with Mamba2 blocks.
 2.) Instead of a single shared attention block, we utilize two shared attention blocks which are interleaved in an ABAB pattern through the network.
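
The ABAB interleaving described in item 2 can be pictured as a layer schedule. The following is only an illustrative sketch, not the Zamba2 implementation: the function name `build_layer_schedule`, the number of Mamba2 blocks, the `shared_every` spacing, and the choice to place each shared block before a group of Mamba2 blocks are all assumptions made for the example. The README only states that two shared attention blocks alternate through a Mamba2 backbone.

```python
from itertools import cycle


def build_layer_schedule(num_mamba_blocks, shared_every, shared_names=("A", "B")):
    """Return a list of layer labels for a hypothetical hybrid stack.

    A shared attention block is placed before every `shared_every`-th
    Mamba2 block, and the shared blocks alternate A, B, A, B, ...
    Every occurrence of the same label stands for the same set of weights.
    All parameters here are illustrative, not values from Zamba2.
    """
    if shared_every <= 0:
        raise ValueError("shared_every must be a positive integer")

    shared = cycle(shared_names)  # yields A, B, A, B, ... indefinitely
    schedule = []
    for i in range(num_mamba_blocks):
        if i % shared_every == 0:
            schedule.append(f"shared_attn_{next(shared)}")
        schedule.append("mamba2")
    return schedule


if __name__ == "__main__":
    # Hypothetical sizes chosen only to keep the printout short.
    for layer in build_layer_schedule(num_mamba_blocks=8, shared_every=2):
        print(layer)
```

Because the labels repeat, a model built from such a schedule would instantiate only two attention blocks and reuse them at every `shared_attn_A` or `shared_attn_B` position, which is what "shared" means in the description above.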