[bugfix] Initialize attention bias on the same device as Query/Key/Value · #13 opened 2 months ago by kenneth-doh
Problems encountered when loading the model for offline operation using SentenceTransformer · #12 opened 2 months ago by HT-NEKO
Fine-tuned model performance decreases when using memory_efficient_attention · #11 opened 6 months ago by hrushikesh1
Any plan to release a TensorFlow-based model? · 1 · #10 opened 9 months ago by undefined-x
Adding to transformers officially? · 👍 2 · #9 opened 9 months ago by pszemraj
Is flash-attention-2 supported? · 1 · #8 opened 11 months ago by Jack7777777
xFormers support for Qwen1.5B · 3 · #6 opened about 1 year ago by le723z
Is the backbone model not going to be open-sourced? · 10 · #4 opened about 1 year ago by JaheimLee
Disable trust_remote_code · 👍 🔥 4 · 17 · #2 opened over 1 year ago by veeravignesh