Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs Paper • 2404.05719 • Published Apr 8, 2024 • 83
Harnessing Webpage UIs for Text-Rich Visual Understanding Paper • 2410.13824 • Published Oct 17, 2024 • 32
DocLayout-YOLO Collection Dataset and model for DocLayout-YOLO • 10 items • Updated Jan 14 • 19
Flex3D: Feed-Forward 3D Generation With Flexible Reconstruction Model And Input View Curation Paper • 2410.00890 • Published Oct 1, 2024 • 20