Zyphra/Zamba2-2.7B

BerenMillidge committed on Jul 28, 2024

Commit 033b404 · verified · 1 Parent(s): 9570f91

Update README.md

Files changed (1)

README.md +1 -1

@@ -3,7 +3,7 @@ license: apache-2.0
 ---
 # Model Card for Zamba v2 2.7B
-Zamba-2-2.7B is a hybrid model between state-space models and transformers. It broadly follows the [Zamba architecture](https://arxiv.org/abs/2405.16712) which consists of a Mamba backbone alternating with shared transformer blocks. Zamba-2-2.7B possesses three major improvements over Zamba1:
+Zamba2-2.7B is a hybrid model between state-space models and transformers. It broadly follows the [Zamba architecture](https://arxiv.org/abs/2405.16712) which consists of a Mamba backbone alternating with shared transformer blocks. Zamba-2-2.7B possesses three major improvements over Zamba1:
 1.) Mamba1 blocks have been replaced with Mamba2 blocks.
 2.) Instead of a single shared attention block, we utilize two shared attention blocks which are interleaved in an ABAB pattern through the network.