SpaceByte: Towards Deleting Tokenization from Large Language Modeling. Paper • 2404.14408 • Published Apr 22, 2024
T-FREE: Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings. Paper • 2406.19223 • Published Jun 27, 2024
Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information. Paper • 2502.14258 • Published 25 days ago
Foundation Text-Generation Models Below 360M Parameters. Collection • Great candidates for fine-tuning, targeting Wllama and Transformers.js on mobile devices; ordered by parameter count. • 35 items • Updated 1 day ago
OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models. Paper • 2503.08686 • Published 5 days ago
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training. Paper • 2411.15124 • Published Nov 22, 2024
Cheems: Wonderful Matrices More Efficient and More Effective Architecture. Paper • 2407.16958 • Published Jul 24, 2024
Wonderful Matrices: Combining for a More Efficient and Effective Foundation Model Architecture. Paper • 2412.11834 • Published Dec 16, 2024
story writing favourites. Collection • Models I personally liked for generating stories in the past. Not a recommendation; many of these are outdated. • 20 items • Updated 10 days ago
Sparse Autoencoders. Collection • SAEs are tools for understanding the internal representations of neural networks; they can be loaded using https://github.com/EleutherAI/sae • 9 items • Updated 19 days ago
Pythia Scaling Suite. Collection • Pythia is the first LLM suite designed specifically to enable scientific research on LLMs; to learn more, see https://github.com/EleutherAI/pythia • 18 items • Updated 19 days ago
Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective. Paper • 2502.17262 • Published 20 days ago