Bug in attention map computation

#3
by gionii - opened

In the following line: https://huggingface.co/Synthyra/ESMplusplus_small/blob/main/modeling_esm_plusplus.py#L324, you are updating attention_mask rather than attn_bias, which is what is actually used to mask the attention values.

I am assuming you followed this template: https://pytorch.org/docs/stable/generated/torch.nn.functional.scaled_dot_product_attention.html
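For illustration, here is a minimal sketch of the pattern being described, based on the reference pseudocode in the PyTorch scaled_dot_product_attention docs. The function name build_attn_bias and the tensor shapes are hypothetical and not taken from modeling_esm_plusplus.py; the actual surrounding code in the model file may differ.

```python
import torch

def build_attn_bias(attention_mask: torch.Tensor, L: int, S: int,
                    dtype: torch.dtype = torch.float32) -> torch.Tensor:
    """Build an additive attention bias from an attention mask,
    following the reference pseudocode in the PyTorch SDPA docs."""
    attn_bias = torch.zeros(L, S, dtype=dtype)
    if attention_mask is not None:
        if attention_mask.dtype == torch.bool:
            # Per the bug report: the original line filled attention_mask
            # in place instead of attn_bias, so the bias handed to SDPA
            # stayed all zeros and no positions were actually masked.
            # The fix is to write -inf into attn_bias at masked positions:
            attn_bias.masked_fill_(attention_mask.logical_not(), float("-inf"))
        else:
            attn_bias += attention_mask
    return attn_bias

# Example: mask out the last key position for a 4x4 attention pattern.
mask = torch.tensor([[True, True, True, False]] * 4)
bias = build_attn_bias(mask, L=4, S=4)
# scores = scores + bias  # -inf entries vanish after softmax
```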

Synthyra org

Hi @gionii,

Thanks for pointing this out! We have fixed the typo.

If you have any other questions or comments please let me know.
Best,
Logan
