Models - a imjliao Collection

Models

updated Apr 12, 2024

JetMoE: Reaching Llama2 Performance with 0.1M Dollars

Paper • 2404.07413 • Published Apr 11, 2024 • 36

Rho-1: Not All Tokens Are What You Need

Paper • 2404.07965 • Published Apr 11, 2024 • 88

Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28, 2024 • 104

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2, 2024 • 104

Learning to Route Among Specialized Experts for Zero-Shot Generalization

Paper • 2402.05859 • Published Feb 8, 2024 • 5