MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval Paper • 2412.14475 • Published Dec 2024 • 51
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation Paper • 2402.03216 • Published Feb 5, 2024 • 4
RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder Paper • 2205.12035 • Published May 24, 2022
C-Pack: Packaged Resources To Advance General Chinese Embedding Paper • 2309.07597 • Published Sep 14, 2023 • 1
Distill-VQ: Learning Retrieval Oriented Vector Quantization By Distilling Knowledge from Dense Embeddings Paper • 2204.00185 • Published Apr 1, 2022
Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval Paper • 2201.05409 • Published Jan 14, 2022
Matching-oriented Product Quantization For Ad-hoc Retrieval Paper • 2104.07858 • Published Apr 16, 2021
LM-Cocktail: Resilient Tuning of Language Models via Model Merging Paper • 2311.13534 • Published Nov 22, 2023 • 4
Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon Paper • 2401.03462 • Published Jan 7, 2024 • 27