Olivia Chen's picture

12 16

Olivia Chen

OliviaChenSir

·

AI & ML interests

None yet

Recent Activity

liked a model about 2 months ago

hfl/chinese-lert-small

liked a model about 2 months ago

microsoft/GODEL-v1_1-large-seq2seq

liked a model about 2 months ago

google/switch-base-8

View all activity

Organizations

None yet

OliviaChenSir's activity

upvoted 12 papers 3 months ago

SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation

Paper • 2411.04989 • Published Nov 7, 2024 • 14

GazeGen: Gaze-Driven User Interaction for Visual Content Generation

Paper • 2411.04335 • Published Nov 7, 2024 • 14

M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models

Paper • 2411.04075 • Published Nov 6, 2024 • 16

M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding

Paper • 2411.04952 • Published Nov 7, 2024 • 28

Analyzing The Language of Visual Tokens

Paper • 2411.05001 • Published Nov 7, 2024 • 23

Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models

Paper • 2411.05005 • Published Nov 7, 2024 • 13

DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

Paper • 2411.04928 • Published Nov 7, 2024 • 49

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Paper • 2411.04996 • Published Nov 7, 2024 • 50

TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation

Paper • 2411.04709 • Published Nov 5, 2024 • 25

BitNet a4.8: 4-bit Activations for 1-bit LLMs

Paper • 2411.04965 • Published Nov 7, 2024 • 64

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Paper • 2411.05003 • Published Nov 7, 2024 • 70

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published Nov 7, 2024 • 115