Collections

Collections including paper arxiv:2411.04709

MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels
Paper • 2405.07526 • Published May 13 • 17

Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach
Paper • 2405.15613 • Published May 24 • 13

A Touch, Vision, and Language Dataset for Multimodal Alignment
Paper • 2402.13232 • Published Feb 20 • 13

How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Paper • 2406.11813 • Published Jun 17 • 30

Papers - Image - Encoders - ViT

DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting
Paper • 2404.06903 • Published Apr 10 • 18

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
Paper • 2404.15653 • Published Apr 24 • 26

MoDE: CLIP Data Experts via Clustering
Paper • 2404.16030 • Published Apr 24 • 12

BlenderAlchemy: Editing 3D Graphics with Vision-Language Models
Paper • 2404.17672 • Published Apr 26 • 18

Papers - Image - Encoders - DinoV2

LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model
Paper • 2404.01331 • Published Mar 29 • 25

OmniFusion Technical Report
Paper • 2404.06212 • Published Apr 9 • 74

MoDE: CLIP Data Experts via Clustering
Paper • 2404.16030 • Published Apr 24 • 12

WildGaussians: 3D Gaussian Splatting in the Wild
Paper • 2407.08447 • Published Jul 11 • 8

Papers - Image - Encoders - Clip

TextCraftor: Your Text Encoder Can be Image Quality Controller
Paper • 2403.18978 • Published Mar 27 • 13

InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation
Paper • 2404.02733 • Published Apr 3 • 20

OmniFusion Technical Report
Paper • 2404.06212 • Published Apr 9 • 74

Transferable and Principled Efficiency for Open-Vocabulary Segmentation
Paper • 2404.07448 • Published Apr 11 • 11

Papers - Image - EfficientNet

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Paper • 1905.11946 • Published May 28, 2019 • 3

TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation
Paper • 2411.04709 • Published 10 days ago • 23

Papers - Image - Swin

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Paper • 2103.14030 • Published Mar 25, 2021 • 4

A Novel Transformer Based Semantic Segmentation Scheme for Fine-Resolution Remote Sensing Images
Paper • 2104.12137 • Published Apr 25, 2021 • 2

Self-Supervised Learning with Swin Transformers
Paper • 2105.04553 • Published May 10, 2021 • 2

Evaluating Transformer-based Semantic Segmentation Networks for Pathological Image Segmentation
Paper • 2108.11993 • Published Aug 26, 2021 • 2

U-Net: Convolutional Networks for Biomedical Image Segmentation
Paper • 1505.04597 • Published May 18, 2015 • 8

Image Segmentation using U-Net Architecture for Powder X-ray Diffraction Images
Paper • 2310.16186 • Published Oct 24, 2023 • 2

H-DenseUNet: Hybrid Densely Connected UNet for Liver and Tumor Segmentation from CT Volumes
Paper • 1709.07330 • Published Sep 21, 2017 • 2

Deep LOGISMOS: Deep Learning Graph-based 3D Segmentation of Pancreatic Tumors on CT scans
Paper • 1801.08599 • Published Jan 25, 2018 • 2

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

about 10 hours ago

Can Large Language Models Understand Context?
Paper • 2402.00858 • Published Feb 1 • 21

OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published Feb 1 • 80

Self-Rewarding Language Models
Paper • 2401.10020 • Published Jan 18 • 144

SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published Jan 30 • 25
