RedPajama: an Open Dataset for Training Large Language Models Paper • arXiv:2411.12372 • Published Nov 19, 2024 • 47 upvotes
LoLCATS Collection Linearizing LLMs with high quality and efficiency. We linearize the full Llama 3.1 model family (8B, 70B, 405B) for the first time! • 4 items • Updated Oct 14, 2024 • 14 upvotes
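For context on the LoLCATS collection above: "linearizing" means replacing a pretrained model's softmax attention with a kernelized form whose cost is linear rather than quadratic in sequence length. Below is a minimal, illustrative PyTorch sketch of generic (non-causal) linear attention; the elu+1 feature map is a common textbook choice, not LoLCATS' learned feature map, and the function name is ours.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v):
    """Generic kernelized linear attention (illustrative, non-causal).

    Softmax attention computes softmax(QK^T)V in O(n^2) time. With a
    positive feature map phi, phi(Q) @ (phi(K)^T @ V) costs O(n).
    elu(x) + 1 is a common illustrative feature map; LoLCATS instead
    learns its feature maps via attention transfer plus LoRA.
    """
    phi_q = F.elu(q) + 1                                # (batch, seq, dim)
    phi_k = F.elu(k) + 1
    kv = torch.einsum("bsd,bse->bde", phi_k, v)         # phi(K)^T V
    z = phi_k.sum(dim=1)                                # phi(K)^T 1, (batch, dim)
    num = torch.einsum("bsd,bde->bse", phi_q, kv)       # phi(Q) (phi(K)^T V)
    den = torch.einsum("bsd,bd->bs", phi_q, z)          # normalizer
    return num / den.unsqueeze(-1)

q = k = v = torch.randn(2, 16, 64)
print(linear_attention(q, k, v).shape)  # torch.Size([2, 16, 64])
```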
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Paper • arXiv:2409.12183 • Published Sep 18, 2024 • 36 upvotes
Enhancing Training Efficiency Using Packing with Flash Attention Paper • arXiv:2407.09105 • Published Jul 12, 2024 • 14 upvotes
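The packing idea behind the paper above: concatenate several short sequences into one row so no compute is wasted on padding, then restrict attention so tokens cannot attend across sequence boundaries. The sketch below builds the cumulative-sequence-length metadata (the kind of boundary information FlashAttention's variable-length kernels consume) and an equivalent block-diagonal causal mask in plain PyTorch; the function names are ours, not the paper's.

```python
import torch

def pack_sequences(seqs):
    """Concatenate sequences into one packed row and return the metadata
    packing needs: cumulative sequence lengths (cu_seqlens) and position
    ids that restart at each sequence boundary."""
    packed = torch.cat(seqs)                                   # (total_tokens,)
    lengths = torch.tensor([len(s) for s in seqs])
    cu_seqlens = torch.cat([torch.zeros(1, dtype=torch.long),
                            lengths.cumsum(0)])                # e.g. [0, 5, 9, 16]
    position_ids = torch.cat([torch.arange(n) for n in lengths])
    return packed, cu_seqlens, position_ids

def block_diagonal_causal_mask(cu_seqlens):
    """Boolean mask allowing causal attention only within each packed
    sequence - the constraint a fused varlen attention kernel enforces."""
    total = int(cu_seqlens[-1])
    doc_id = torch.zeros(total, dtype=torch.long)
    for i in range(len(cu_seqlens) - 1):
        doc_id[cu_seqlens[i]:cu_seqlens[i + 1]] = i            # which sequence owns each token
    same_doc = doc_id[:, None] == doc_id[None, :]
    causal = torch.tril(torch.ones(total, total, dtype=torch.bool))
    return same_doc & causal

seqs = [torch.arange(5), torch.arange(4), torch.arange(7)]
packed, cu_seqlens, position_ids = pack_sequences(seqs)
mask = block_diagonal_causal_mask(cu_seqlens)
print(cu_seqlens.tolist())   # [0, 5, 9, 16]
print(mask.shape)            # torch.Size([16, 16])
```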
Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler Paper • arXiv:2408.13359 • Published Aug 23, 2024 • 22 upvotes
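As a rough illustration of the shape such a schedule takes: a linear warmup followed by a power-law decay in the number of trained tokens, capped at a maximum rate. The exact functional form and fitted constants come from the paper; every value below is a placeholder, not the paper's.

```python
def power_law_lr(tokens_seen: float,
                 lr_max: float = 3e-4,
                 a: float = 4.0,
                 b: float = 0.5,
                 warmup_tokens: float = 1e8) -> float:
    """Illustrative power-law learning-rate schedule.

    Linear warmup, then lr decays as a * n^(-b) in tokens seen, capped
    at lr_max. All constants are placeholders for illustration; see the
    paper for the actual Power scheduler definition.
    """
    if tokens_seen < warmup_tokens:
        return lr_max * tokens_seen / warmup_tokens
    return min(lr_max, a * tokens_seen ** (-b))

for n in (1e7, 1e8, 1e9, 1e10, 1e11):
    print(f"{n:.0e} tokens -> lr {power_law_lr(n):.2e}")
```

Because the decay is tied to tokens seen rather than to a fixed total step count, the same schedule transfers across runs with different batch sizes and token budgets, which is the property the title advertises.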
Power-LM Collection Dense & MoE LLMs trained with the Power learning rate scheduler. • 4 items • Updated Oct 17, 2024 • 15 upvotes
Article Introducing AuraFace: Open-Source Face Recognition and Identity Preservation Models By isidentical • Aug 26, 2024 • 37 upvotes
To Code, or Not To Code? Exploring Impact of Code in Pre-training Paper • arXiv:2408.10914 • Published Aug 20, 2024 • 41 upvotes
💻 Local SmolLMs Collection SmolLM models in MLC, ONNX, and GGUF formats for local applications + in-browser demos • 14 items • Updated 2 days ago • 46 upvotes
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M, and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos. • 12 items • Updated 2 days ago • 204 upvotes
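To make the collection concrete, here is a minimal transformers snippet for running one of the instruct checkpoints locally. The checkpoint id HuggingFaceTB/SmolLM-360M-Instruct and the generation settings are our assumptions; check the model card for the recommended configuration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Checkpoint id assumed from the collection; see the model card for
# the exact name and recommended generation settings.
checkpoint = "HuggingFaceTB/SmolLM-360M-Instruct"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

messages = [{"role": "user", "content": "Explain gradient checkpointing in one paragraph."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```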
Article A failed experiment: Infini-Attention, and why we should keep trying? Aug 14, 2024 • 53 upvotes
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs Paper • arXiv:2408.07055 • Published Aug 13, 2024 • 64 upvotes