ByT5: Towards a token-free future with pre-trained byte-to-byte models • Paper • 2105.13626 • Published May 28, 2021
Byte Latent Transformer: Patches Scale Better Than Tokens • Paper • 2412.09871 • Published December 2024