Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs Paper • 2404.05719 • Published Apr 8 • 82
Candle Segment Anything Wasm: Segment Anything Model on the Browser with Candle/Rust/WASM • Running • 121
Harnessing Webpage UIs for Text-Rich Visual Understanding Paper • 2410.13824 • Published Oct 17 • 29