Bridging the Data Provenance Gap Across Text, Speech and Video Paper • 2412.17847 • Published 6 days ago • 1
Falcon3 Collection Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated 6 days ago • 70
Article Comparing Open-source and Proprietary LLMs in Medical AI By mpimentel • Oct 3 • 16
Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models Paper • 2412.02980 • Published 21 days ago • 12
Insight-V Collection Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models • 5 items • Updated Nov 22 • 9
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated 28 days ago • 99
Biomedical Collection Models for biomedical research applications, such as radiology report generation and biomedical language understanding. • 9 items • Updated Nov 1 • 5
FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI Paper • 2411.04872 • Published Nov 7 • 4
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated 2 days ago • 195
LLM Reasoning Papers Collection Papers to improve reasoning capabilities of LLMs • 17 items • Updated 2 days ago • 89
∞Bench: Extending Long Context Evaluation Beyond 100K Tokens Paper • 2402.13718 • Published Feb 21 • 1
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations Paper • 2408.08459 • Published Aug 15 • 45
Data Authenticity, Consent, & Provenance for AI are all broken: what will it take to fix them? Paper • 2404.12691 • Published Apr 19 • 2
The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI Paper • 2310.16787 • Published Oct 25, 2023 • 5