Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning Paper • 2412.11974 • Published 10 days ago • 8
M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework Paper • 2411.06176 • Published Nov 9 • 44
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse Paper • 2409.11242 • Published Sep 17 • 5 • 2
Language Model Unalignment: Parametric Red-Teaming to Expose Hidden Harms and Biases Paper • 2310.14303 • Published Oct 22, 2023 • 1
Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model Paper • 2311.00968 • Published Nov 2, 2023
PuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Abstract Visual Patterns Paper • 2403.13315 • Published Mar 20
CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models Paper • 2404.00569 • Published Mar 31 • 1
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling Paper • 2406.11617 • Published Jun 17 • 8
Ruby Teaming: Improving Quality Diversity Search with Memory for Automated Red Teaming Paper • 2406.11654 • Published Jun 17 • 6
Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique Paper • 2408.10701 • Published Aug 20 • 11