Seeker: Towards Exception Safety Code Generation with Intermediate Language Agents Framework Paper • 2412.11713 • Published 9 days ago • 3
EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific Evaluations Paper • 2410.22821 • Published Oct 30
Identity-Preserving Text-to-Video Generation by Frequency Decomposition Paper • 2411.17440 • Published 29 days ago • 35
LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference Paper • 2406.18139 • Published Jun 26 • 2
Seeker: Enhancing Exception Handling in Code with LLM-based Multi-Agent Approach Paper • 2410.06949 • Published Oct 9 • 5
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators Paper • 2404.05014 • Published Apr 7 • 32
Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach Paper • 2401.15652 • Published Jan 28
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation Paper • 2406.18522 • Published Jun 26 • 19
DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories Paper • 2405.19856 • Published May 30 • 8
DevEval: Evaluating Code Generation in Practical Software Projects Paper • 2401.06401 • Published Jan 12
EvoCodeBench: An Evolving Code Generation Benchmark Aligned with Real-World Code Repositories Paper • 2404.00599 • Published Mar 31 • 1
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? Paper • 2403.14624 • Published Mar 21 • 51
UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation Paper • 2308.07732 • Published Aug 15, 2023 • 2
GiT: Towards Generalist Vision Transformer through Universal Language Interface Paper • 2403.09394 • Published Mar 14 • 25
FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation Paper • 2403.06775 • Published Mar 11 • 3
Post: Welcome Bunny! A family of lightweight but powerful multimodal models from BAAI. With detailed work on dataset curation, the Bunny-3B model, built upon SigLIP and Phi-2, achieves performance on par with 13B models. Model: BAAI/bunny-phi-2-siglip-lora
Post: There appears to be a huge misunderstanding regarding the licensing requirements for open-sourced Chinese-speaking LLMs on @huggingface. I initially shared this misconception too, but after conducting some research, I came up with the list below. Very impressive!
Post: Vision LLM for #edgecomputing? @openbmb, who open-sourced the UltraFeedback dataset before, released a series of strong, eco-friendly yet powerful LLMs:
- MiniCPM: a 2B model that competes with Mistral-7B
- MiniCPM-V: a 3B vision LLM on edge!