wenyang
notoookay
1 follower · 4 following
wenyang_t
notoookay
AI & ML interests
NLP, RL
Recent Activity
liked a dataset, 2 days ago: BAAI/CCI3-HQ
replied to singhsidhukuldeep's post, 2 days ago:
Exciting breakthrough in AI: @Meta's new Byte Latent Transformer (BLT) revolutionizes language models by eliminating tokenization! The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:

>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.
Three-Component Architecture:
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes

>> Technical Advantages
• Matches performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models

>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.

This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!
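
The entropy-based patching mentioned in the post can be illustrated with a rough sketch. This is not the BLT implementation: the post only says that an entropy model determines patch boundaries, so everything concrete here is an assumption made for illustration. A toy bigram byte model stands in for the entropy model, a fixed threshold decides where a new patch starts, and the names `train_bigram_counts`, `next_byte_entropy`, `entropy_patches`, and `threshold` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_bigram_counts(corpus: bytes):
    """Count next-byte occurrences per preceding byte (toy stand-in for an entropy model)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1
    return counts


def next_byte_entropy(counts, prev: int) -> float:
    """Shannon entropy, in bits, of the next-byte distribution given the previous byte."""
    dist = counts.get(prev)
    if not dist:
        # Assumption: an unseen context is treated as maximally uncertain (256 byte values).
        return 8.0
    total = sum(dist.values())
    return -sum((c / total) * math.log2(c / total) for c in dist.values())


def entropy_patches(data: bytes, counts, threshold: float = 2.0):
    """Split data into variable-sized patches.

    A new patch starts at position i when the estimated entropy of the byte
    at i (given the byte before it) exceeds the threshold.
    """
    if not data:
        return []
    patches, start = [], 0
    for i in range(1, len(data)):
        if next_byte_entropy(counts, data[i - 1]) > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches


if __name__ == "__main__":
    text = b"the cat sat on the mat and the cat ate the rat"
    model = train_bigram_counts(text)
    for patch in entropy_patches(text, model, threshold=1.5):
        print(patch)
```

Predictable stretches of bytes end up in longer patches, while hard-to-predict positions open new ones, which is the intuition behind spending more compute where the data is more complex.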
liked a Space, 5 days ago: data-agents/jupyter-agent
Organizations
notoookay's activity
liked a dataset, 2 days ago: BAAI/CCI3-HQ (Viewer • Updated Nov 11 • 54.8M • 8.42k • 32)
replied to singhsidhukuldeep's post, 2 days ago:
Got it. https://arxiv.org/abs/2412.09871
liked a Space, 5 days ago: 🏃 Jupyter Agent (Running • 97)