`is_causal=False` marker

#22

This implementation uses non-causal (bidirectional) attention.
An explicit `is_causal=False` marker records this implementation detail, so that code which checks it in the future can behave correctly.
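
To illustrate the distinction the marker records, here is a minimal standard-library sketch. It is not this repository's code: the function name `attention_weights` and the toy scores are hypothetical, and it only shows how an `is_causal` flag changes which positions a token can attend to.

```python
import math

def attention_weights(scores, is_causal=False):
    """Row-wise softmax over raw attention scores (a list of lists).

    Hypothetical helper for illustration only.
    is_causal=True:  position i attends only to positions j <= i.
    is_causal=False: every position attends to every other position.
    """
    weights = []
    for i, row in enumerate(scores):
        # Future positions are masked out only in the causal case.
        masked = [s if (not is_causal or j <= i) else float("-inf")
                  for j, s in enumerate(row)]
        peak = max(masked)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Uniform toy scores for a 3-token sequence.
scores = [[0.0, 0.0, 0.0] for _ in range(3)]
non_causal = attention_weights(scores, is_causal=False)
causal = attention_weights(scores, is_causal=True)
```

With `is_causal=False` the first token spreads its weight over all three positions, whereas with `is_causal=True` it can only attend to itself. An encoder-style model relies on the former, which is why stating the flag explicitly is useful.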

thenlper changed pull request status to merged
