Longformer and Reformer are models that aim to be more efficient by
using a sparse version of the attention matrix to speed up training.
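
To illustrate what a sparse attention matrix can look like, here is a minimal sketch in plain Python. It builds a sliding-window mask, which resembles the local attention pattern used by Longformer; Reformer takes a different route (locality-sensitive hashing), which this sketch does not cover. The function names, the sequence length, and the window size are assumptions chosen for illustration, not part of either model's API.

```python
def sliding_window_mask(seq_len, window):
    """Build a seq_len x seq_len boolean mask (hypothetical helper).

    Entry [i][j] is True when token i may attend to token j, i.e. when
    j lies within window // 2 positions of i.
    """
    half = window // 2
    return [[abs(i - j) <= half for j in range(seq_len)] for i in range(seq_len)]


def count_allowed(mask):
    """Count the attention scores that would actually be computed."""
    return sum(sum(row) for row in mask)


# Illustrative sizes only; real models use far longer sequences and windows.
seq_len, window = 8, 2
mask = sliding_window_mask(seq_len, window)

dense_entries = seq_len * seq_len      # full attention: every pair of tokens
sparse_entries = count_allowed(mask)   # windowed attention: nearby pairs only

print(f"dense: {dense_entries}, sparse: {sparse_entries}")
```

With a fixed window, the number of allowed entries grows roughly linearly with the sequence length instead of quadratically, which is where the efficiency gain comes from.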