Bart Patterson

bartpatterson

bartpatterson's activity

upvoted 60 papers 3 months ago

VolDoGer: LLM-assisted Datasets for Domain Generalization in Vision-Language Tasks

Paper • 2407.19795 • Published Jul 29 • 10

Sentiment Analysis of Lithuanian Online Reviews Using Large Language Models

Paper • 2407.19914 • Published Jul 29 • 12

ImagiNet: A Multi-Content Dataset for Generalizable Synthetic Image Detection via Contrastive Learning

Paper • 2407.20020 • Published Jul 29 • 19

SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages

Paper • 2407.19672 • Published Jul 29 • 54

FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention

Paper • 2407.19918 • Published Jul 29 • 47

Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning

Paper • 2407.20798 • Published Jul 30 • 23

Meltemi: The first open Large Language Model for Greek

Paper • 2407.20743 • Published Jul 30 • 67

Futga: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation

Paper • 2407.20445 • Published Jul 29 • 20

Knesset-DictaBERT: A Hebrew Language Model for Parliamentary Proceedings

Paper • 2407.20581 • Published Jul 30 • 23

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Paper • 2405.21075 • Published May 31 • 18

Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling

Paper • 2405.21048 • Published May 31 • 12

SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency

Paper • 2407.17470 • Published Jul 24 • 14

PERSONA: A Reproducible Testbed for Pluralistic Alignment

Paper • 2407.17387 • Published Jul 24 • 17

HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation

Paper • 2407.17438 • Published Jul 24 • 23

Integrating Large Language Models into a Tri-Modal Architecture for Automated Depression Classification

Paper • 2407.19340 • Published Jul 27 • 56

Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle

Paper • 2407.19548 • Published Jul 28 • 22

3D Question Answering for City Scene Understanding

Paper • 2407.17398 • Published Jul 24 • 21

ATHAR: A High-Quality and Diverse Dataset for Classical Arabic to English Translation

Paper • 2407.19835 • Published Jul 29 • 20

SkipDecode: Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference

Paper • 2307.02628 • Published Jul 5, 2023 • 10

Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers

Paper • 2307.03183 • Published Jul 6, 2023 • 10

A Survey on Evaluation of Large Language Models

Paper • 2307.03109 • Published Jul 6, 2023 • 42

ReMaX: Relaxing for Better Training on Efficient Panoptic Segmentation

Paper • 2306.17319 • Published Jun 29, 2023 • 3

The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit

Paper • 2306.17759 • Published Jun 30, 2023 • 4

Statler: State-Maintaining Language Models for Embodied Reasoning

Paper • 2306.17840 • Published Jun 30, 2023 • 12

Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors

Paper • 2306.17843 • Published Jun 30, 2023 • 43

SHIC: Shape-Image Correspondences with no Keypoint Supervision

Paper • 2407.18907 • Published Jul 26 • 39

Show, Don't Tell: Aligning Language Models with Demonstrated Feedback

Paper • 2406.00888 • Published Jun 2 • 30

Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models

Paper • 2403.12881 • Published Mar 19 • 16

AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability

Paper • 2405.14129 • Published May 23 • 12

Imp: Highly Capable Large Multimodal Models for Mobile Devices

Paper • 2405.12107 • Published May 20 • 25

WavLLM: Towards Robust and Adaptive Speech Large Language Model

Paper • 2404.00656 • Published Mar 31 • 9

Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model

Paper • 2404.04167 • Published Apr 5 • 12

Synth^2: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings

Paper • 2403.07750 • Published Mar 12 • 21

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Paper • 2403.07508 • Published Mar 12 • 75

Resonance RoPE: Improving Context Length Generalization of Large Language Models

Paper • 2403.00071 • Published Feb 29 • 22

Beyond Language Models: Byte Models are Digital World Simulators

Paper • 2402.19155 • Published Feb 29 • 49

ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition

Paper • 2402.15220 • Published Feb 23 • 19

Genie: Generative Interactive Environments

Paper • 2402.15391 • Published Feb 23 • 70

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

Paper • 2402.00159 • Published Jan 31 • 59

OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 79

Wolf: Captioning Everything with a World Summarization Framework

Paper • 2407.18908 • Published Jul 26 • 30

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text

Paper • 2403.14773 • Published Mar 21 • 9

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

Paper • 2403.11703 • Published Mar 18 • 16

LightIt: Illumination Modeling and Control for Diffusion Models

Paper • 2403.10615 • Published Mar 15 • 16

Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation

Paper • 2403.12015 • Published Mar 18 • 63

Design2Code: How Far Are We From Automating Front-End Engineering?

Paper • 2403.03163 • Published Mar 5 • 93

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5 • 56

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

Paper • 2403.03100 • Published Mar 5 • 34

ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models

Paper • 2403.02084 • Published Mar 4 • 14

DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models

Paper • 2403.00818 • Published Feb 26 • 14

TripoSR: Fast 3D Object Reconstruction from a Single Image

Paper • 2403.02151 • Published Mar 4 • 11

TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4 • 89

LLaMA Pro: Progressive LLaMA with Block Expansion

Paper • 2401.02415 • Published Jan 4 • 53

Pheme: Efficient and Conversational Speech Generation

Paper • 2401.02839 • Published Jan 5 • 16

Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively

Paper • 2401.02955 • Published Jan 5 • 19

Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM

Paper • 2401.02994 • Published Jan 4 • 47

Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach

Paper • 2401.02987 • Published Jan 2 • 9

AGG: Amortized Generative 3D Gaussians for Single Image to 3D

Paper • 2401.04099 • Published Jan 8 • 8

Mixtral of Experts

Paper • 2401.04088 • Published Jan 8 • 157

NNsight and NDIF: Democratizing Access to Foundation Model Internals

Paper • 2407.14561 • Published Jul 18 • 34