Lucas

lckr

AI & ML interests

None yet

lckr's activity

upvoted 2 collections 7 months ago

Online RLHF

Datasets, code, and models for online RLHF (i.e., iterative DPO) • 19 items • Updated Jun 12 • 4

Phi-3

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 26 items • Updated Nov 14 • 537

upvoted a collection 8 months ago

Llama3-ChatQA-1.5

Llama3-ChatQA-1.5 models excel at conversational question answering (QA) and retrieval-augmented generation (RAG). • 6 items • Updated Oct 1 • 41

upvoted 17 papers about 1 year ago

GAIA: a benchmark for General AI Assistants

Paper • 2311.12983 • Published Nov 21, 2023 • 185

How FaR Are Large Language Models From Agents with Theory-of-Mind?

Paper • 2310.03051 • Published Oct 4, 2023 • 34

ExpertQA: Expert-Curated Questions and Attributed Answers

Paper • 2309.07852 • Published Sep 14, 2023 • 1

Wuerstchen: Efficient Pretraining of Text-to-Image Models

Paper • 2306.00637 • Published Jun 1, 2023 • 12

AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model

Paper • 2309.16058 • Published Sep 27, 2023 • 55

RWKV: Reinventing RNNs for the Transformer Era

Paper • 2305.13048 • Published May 22, 2023 • 15

Toolformer: Language Models Can Teach Themselves to Use Tools

Paper • 2302.04761 • Published Feb 9, 2023 • 11

LLaMA: Open and Efficient Foundation Language Models

Paper • 2302.13971 • Published Feb 27, 2023 • 13

The Flan Collection: Designing Data and Methods for Effective Instruction Tuning

Paper • 2301.13688 • Published Jan 31, 2023 • 8

Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 11

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Paper • 2201.11903 • Published Jan 28, 2022 • 9

Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 16

Training Compute-Optimal Large Language Models

Paper • 2203.15556 • Published Mar 29, 2022 • 10

High-Resolution Image Synthesis with Latent Diffusion Models

Paper • 2112.10752 • Published Dec 20, 2021 • 12

DINOv2: Learning Robust Visual Features without Supervision

Paper • 2304.07193 • Published Apr 14, 2023 • 5

Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets

Paper • 2201.02177 • Published Jan 6, 2022 • 2

The Forward-Forward Algorithm: Some Preliminary Investigations

Paper • 2212.13345 • Published Dec 27, 2022 • 2