EleutherAI

non-profit

Verified

https://eleuther.ai

AIEleuther

EleutherAI

AI & ML interests

Large language models, scaling laws, AI Alignment, democratization of DL

EleutherAI's activity

luciaquirke

published a dataset 8 days ago

EleutherAI/SmolLM2-135M-100B

Updated 8 days ago • 15

luciaquirke

updated a dataset 8 days ago

EleutherAI/SmolLM2-135M-10B

Updated 8 days ago • 10.7M • 251

luciaquirke

published a dataset 8 days ago

EleutherAI/SmolLM2-135M-10B

Updated 8 days ago • 10.7M • 251

Skylion007

authored a paper 14 days ago

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published 16 days ago • 62

hyunwoongko

authored a paper 29 days ago

Kanana: Compute-efficient Bilingual Language Models

Paper • 2502.18934 • Published about 1 month ago • 64

pietrolesci

authored a paper 29 days ago

Self-Training Large Language Models for Tool-Use Without Demonstrations

Paper • 2502.05867 • Published Feb 9

bzantium

authored a paper 29 days ago

Kanana: Compute-efficient Bilingual Language Models

Paper • 2502.18934 • Published about 1 month ago • 64

avi-skowron

authored a paper about 1 month ago

Beyond Release: Access Considerations for Generative AI Systems

Paper • 2502.16701 • Published Feb 23 • 12

amphora

authored a paper about 1 month ago

Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning

Paper • 2502.17407 • Published Feb 24 • 25

craffel

authored a paper about 2 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 215

stellaathena

authored a paper about 2 months ago

Open Problems in Mechanistic Interpretability

Paper • 2501.16496 • Published Jan 27 • 19

storytracer

authored a paper 2 months ago

Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published Jan 14 • 60

avi-skowron

authored a paper 2 months ago

Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published Jan 14 • 60

stellaathena

authored a paper 2 months ago

Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published Jan 14 • 60

Skylion007

authored a paper 3 months ago

The GAN is dead; long live the GAN! A Modern GAN Baseline

Paper • 2501.05441 • Published Jan 9 • 91

ncoop57

authored 2 papers 3 months ago

Stable Code Technical Report

Paper • 2404.01226 • Published Apr 1, 2024 • 1

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 139

akhaliq

posted an update 3 months ago

Google drops Gemini 2.0 Flash Thinking

a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts. The model plans (with thoughts visible), can solve complex problems with Flash speeds, and more

now available in anychat, try it out: akhaliq/anychat

baber

authored a paper 4 months ago

Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models

Paper • 2412.02980 • Published Dec 4, 2024 • 14

akhaliq

posted an update 4 months ago

QwQ-32B-Preview is now available in anychat

A reasoning model that is competitive with OpenAI o1-mini and o1-preview

try it out: akhaliq/anychat
