Text Generation · Transformers · Safetensors · zamba2 · Inference Endpoints
BerenMillidge committed (verified)
Commit 033b404 · 1 Parent(s): 9570f91

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -3,7 +3,7 @@ license: apache-2.0
 ---
 # Model Card for Zamba v2 2.7B
 
-Zamba-2-2.7B is a hybrid model between state-space models and transformers. It broadly follows the [Zamba architecture](https://arxiv.org/abs/2405.16712) which consists of a Mamba backbone alternating with shared transformer blocks. Zamba-2-2.7B possesses three major improvements over Zamba1:
+Zamba2-2.7B is a hybrid model between state-space models and transformers. It broadly follows the [Zamba architecture](https://arxiv.org/abs/2405.16712) which consists of a Mamba backbone alternating with shared transformer blocks. Zamba-2-2.7B possesses three major improvements over Zamba1:
 
 1.) Mamba1 blocks have been replaced with Mamba2 blocks.
 2.) Instead of a single shared attention block, we utilize two shared attention blocks which are interleaved in an ABAB pattern through the network.
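The updated paragraph describes the layer layout only in prose; the PyTorch sketch below makes the ABAB pattern concrete. It is not Zyphra's implementation: the block classes are simple stand-ins, and the model width, depth, and the interval at which the shared blocks are inserted (`insert_every`) are illustrative assumptions. Only the alternation between two shared attention blocks, reused with the same weights at every application, follows the text of the commit.

```python
import torch
import torch.nn as nn

class Mamba2Block(nn.Module):
    """Stand-in for a Mamba2 mixer block (the real block is a state-space layer)."""
    def __init__(self, d_model: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.mix = nn.Linear(d_model, d_model)  # placeholder for the SSM mixer

    def forward(self, x):
        return x + self.mix(self.norm(x))

class SharedTransformerBlock(nn.Module):
    """Stand-in for a shared attention block."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        h = self.norm(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        return x + attn_out

class Zamba2LayoutSketch(nn.Module):
    """Mamba2 backbone with two shared blocks (A and B) applied in an ABAB pattern."""
    def __init__(self, d_model: int = 512, n_mamba_blocks: int = 12, insert_every: int = 3):
        super().__init__()
        self.backbone = nn.ModuleList([Mamba2Block(d_model) for _ in range(n_mamba_blocks)])
        # Only two transformer blocks exist; their parameters are reused each time
        # they are applied, alternating A, B, A, B, ... down the depth of the network.
        self.shared_blocks = nn.ModuleList([SharedTransformerBlock(d_model),
                                            SharedTransformerBlock(d_model)])
        self.insert_every = insert_every  # spacing of shared-block applications (assumed)

    def forward(self, x):
        n_applications = 0
        for i, mamba_block in enumerate(self.backbone):
            x = mamba_block(x)
            if (i + 1) % self.insert_every == 0:
                shared = self.shared_blocks[n_applications % 2]  # A on even turns, B on odd
                x = shared(x)
                n_applications += 1
        return x

model = Zamba2LayoutSketch()
tokens = torch.randn(1, 16, 512)   # (batch, sequence, d_model)
print(model(tokens).shape)         # torch.Size([1, 16, 512])
```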