This is a tiny Jamba reward model used for development, debugging, and experimentation over the Jamba architecture.

It has 319M parameters, compared to 52B in Jamba 1.5 Mini (and in Jamba v0.1) and 398B in Jamba 1.5 Large, and was trained on ~40B tokens.

This model was created for unit testing purposes by turning the first three rows of Jamba-tiny-dev's LM head into a 3-attribute reward head. The bias was set to [1000, -1000, 0], so the outputs will fall in that ballpark. Because of how it was created, this model is not intended to provide value as an actual reward model.
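The construction described above can be sketched as follows. This is a minimal, hypothetical illustration with toy dimensions, not the actual conversion script or the real Jamba-tiny-dev checkpoint:

```python
import torch
import torch.nn as nn

# Toy sizes for illustration; the real model's hidden size and vocabulary
# are much larger.
hidden_size, vocab_size = 16, 64
lm_head = nn.Linear(hidden_size, vocab_size, bias=False)  # stand-in LM head

# Reuse the first three rows of the LM head as a 3-attribute reward head,
# and set the bias to [1000, -1000, 0] as described above.
reward_head = nn.Linear(hidden_size, 3, bias=True)
with torch.no_grad():
    reward_head.weight.copy_(lm_head.weight[:3])
    reward_head.bias.copy_(torch.tensor([1000.0, -1000.0, 0.0]))

# With a zero hidden state the scores equal the bias exactly; for real
# hidden states they land in the same ballpark.
scores = reward_head(torch.zeros(1, hidden_size))
```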
