11.8 TFLOPS
4
3
whitebill
2 followers · 34 following
AI & ML interests
None yet
Recent Activity
reacted to singhsidhukuldeep's post 4 days ago
Exciting breakthrough in AI: @Meta's new Byte Latent Transformer (BLT) revolutionizes language models by eliminating tokenization!

The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:

>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.

Three-Component Architecture:
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes

>> Technical Advantages
• Matches performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models

>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.

This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!
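The entropy-based patching idea in the post can be illustrated with a small sketch. This is not Meta's implementation: the post only says an entropy model determines patch boundaries, so the code below assumes a toy bigram count model as a stand-in for that entropy model, and the function names (`fit_bigram_counts`, `next_byte_entropy`, `entropy_patches`) and the fixed `threshold` rule are hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def fit_bigram_counts(corpus: bytes) -> dict:
    """Count next-byte occurrences per preceding byte (toy stand-in for an entropy model)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1
    return counts


def next_byte_entropy(counts: dict, prev: int) -> float:
    """Shannon entropy in bits of the next-byte distribution following `prev`."""
    dist = counts.get(prev)
    if not dist:
        # Assumption: an unseen context is treated as maximally uncertain (256 byte values).
        return 8.0
    total = sum(dist.values())
    return -sum((c / total) * math.log2(c / total) for c in dist.values())


def entropy_patches(data: bytes, counts: dict, threshold: float) -> list:
    """Start a new patch wherever the predicted next-byte entropy exceeds `threshold`."""
    if not data:
        return []
    patches, start = [], 0
    for i in range(1, len(data)):
        if next_byte_entropy(counts, data[i - 1]) > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches


if __name__ == "__main__":
    corpus = b"the cat sat on the mat. the cat sat on the hat."
    model = fit_bigram_counts(corpus)
    print(entropy_patches(b"the cat sat on the mat.", model, threshold=1.0))
```

Under these assumptions, predictable stretches of bytes end up in longer patches and uncertain positions open new ones, which is the intuition behind spending more compute where the data is harder to predict.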
updated a collection 8 days ago: my1
reacted to clem's post 8 days ago
Coming back to Paris Friday to open our new Hugging Face office! We're at capacity for the party but add your name in the waiting list as we're trying to privatize the passage du Caire for extra space for robots 🤖🦾🦿 https://t.co/enkFXjWndJ
Organizations
whitebill's activity
liked a Space 8 days ago
Running • 2 • Mi50 • MI50 inference
liked a model about 2 months ago
ruslandev/llama-3-8b-gpt-4o • Text Generation • Updated Jun 12 • 44 • 2
liked a Space 7 months ago
Runtime error • 82 • GEN VISION • High Detailed Image Generation : SDXL