Jaemin Cho

j-min

https://j-min.io

AI & ML interests

None yet

Recent Activity

authored a paper 5 days ago

Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents

commented on a paper 6 days ago

Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents

upvoted a paper 6 days ago

Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents

commented on a paper 6 days ago

Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents

Paper • 2508.05954 • Published 10 days ago • 6

New activity in j-min/PaintSkills 29 days ago

Upload train images (in zip files)

#8 opened 29 days ago by

Upload count/train_images

#7 opened 2 months ago by

commented on a paper 30 days ago

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Paper • 2507.13344 • Published Jul 17 • 55

New activity in j-min/PaintSkills 2 months ago

Upload spatial/val_images

#5 opened 2 months ago by

Upload object/val_images

#4 opened 2 months ago by

Upload count/val_images

#3 opened 2 months ago by

commented on a paper 2 months ago

Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning

Paper • 2506.03525 • Published Jun 4 • 6

commented on a paper 3 months ago

EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance

Paper • 2505.21876 • Published May 28 • 9

New activity in j-min/vicuna-13b-v0-merged 3 months ago

Adding `safetensors` variant of this model

#2 opened 3 months ago by

commented on a paper 4 months ago

CAPTURe: Evaluating Spatial Reasoning in Vision Language Models via Occluded Object Counting

Paper • 2504.15485 • Published Apr 21 • 5

commented on a paper 7 months ago

M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding

Paper • 2411.04952 • Published Nov 7, 2024 • 30

New activity in j-min/layoutbench 7 months ago

[bot] Conversion to Parquet

#1 opened 8 months ago by parquet-converter

commented on 3 papers 9 months ago

VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement

Paper • 2411.15115 • Published Nov 22, 2024 • 9

VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement

Paper • 2411.15115 • Published Nov 22, 2024 • 9

M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding

Paper • 2411.04952 • Published Nov 7, 2024 • 30

New activity in j-min/reco_sd14_laion 10 months ago

customized dataset

#1 opened over 1 year ago by

New activity in yubo2333/MMLongBench-Doc about 1 year ago

dataset download error

#2 opened about 1 year ago by

New activity in mulan-dataset/v1.0 over 1 year ago

Example script for adding one object at a time

#9 opened over 1 year ago by

New activity in j-min/vicuna-13b-v0-merged about 2 years ago

Upload folder using huggingface_hub

#1 opened about 2 years ago by