rubatoyeong (Jinyeong Kim)

upvoted 3 papers 2 months ago

ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning

Paper • 2506.09513 • Published Jun 11 • 98

Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization

Paper • 2506.10920 • Published Jun 12 • 6

Revisit What You See: Disclose Language Prior in Vision Tokens for Efficient Guided Decoding of LVLMs

Paper • 2506.09522 • Published Jun 11 • 20

upvoted a paper 3 months ago

Softpick: No Attention Sink, No Massive Activations with Rectified Softmax

Paper • 2504.20966 • Published Apr 29 • 32

upvoted 16 papers 4 months ago

CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

Paper • 2504.13820 • Published Apr 18 • 17

It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization

Paper • 2504.13173 • Published Apr 17 • 19

Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published Apr 22 • 63

NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation

Paper • 2504.13055 • Published Apr 17 • 19

Perception Encoder: The best visual embeddings are not at the output of the network

Paper • 2504.13181 • Published Apr 17 • 35

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

Paper • 2504.13169 • Published Apr 17 • 39

FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding

Paper • 2504.09925 • Published Apr 14 • 38

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published Apr 14 • 280

Towards Visual Text Grounding of Multimodal Large Language Model

Paper • 2504.04974 • Published Apr 7 • 16

Scaling Laws for Native Multimodal Models

Paper • 2504.07951 • Published Apr 10 • 29

Jinyeong Kim

AI & ML interests

Organizations

Self-Steering Language Models

DDT: Decoupled Diffusion Transformer

Clinical ModernBERT: An efficient and long context encoder for biomedical text

Concept Lancet: Image Editing with Compositional Representation Transplant

SmolVLM: Redefining small and efficient multimodal models

Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models

rubatoyeong's activity