Forgetting Transformer: Softmax Attention with a Forget Gate • Paper • 2503.02130 • Published 10 days ago • 26
WildChat-50m • Collection • All model responses associated with the WildChat-50m paper. • 55 items • Updated Jan 29 • 7
Korean-Adapted Model Series • Collection • Korean-adapted Language Model Series • 13 items • Updated May 17, 2024 • 27