phi-1_5 / configuration_mixformer_sequential.py

Commit History

Improves type hinting on configuration arguments.
92557d0

gugarosa committed on

Enables toggling fused_dense, flash_rotary and attn_pdrop in the configuration.
45f4b21

gugarosa committed on

Adds support for MQA/GQA and attention mask during training.
de35f90

gugarosa committed on

Support for `attention_mask` in forward pass.
3128bb6

gugarosa committed on

Upload MixFormerSequentialForCausalLM
1698206

suriyagunasekar committed on