Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models • Paper • 2404.13013 • Published Apr 19 • 30
FeatUp: A Model-Agnostic Framework for Features at Any Resolution • Paper • 2403.10516 • Published Mar 15 • 16
Awesome Document AI • Collection • A collection of open-source document AI 📄 📝 📈 • 27 items • Updated Mar 11 • 75
DocLLM: A layout-aware generative language model for multimodal document understanding • Paper • 2401.00908 • Published Dec 31, 2023 • 181
Recent models: last 100 repos, sorted by creation date • Collection • The last 100 repos I have created, sorted by creation date descending so that the most recently created repos appear at the top. • 121 items • Updated Jan 31 • 506
LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching • Paper • 2311.11284 • Published Nov 19, 2023 • 16