Collections
Discover the best community collections!
Collections trending this week
- Nougat: Neural Optical Understanding for Academic Documents
  Paper • 2308.13418 • Published • 37
- mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
  Paper • 2307.02499 • Published • 15
- Text Rendering Strategies for Pixel Language Models
  Paper • 2311.00522 • Published • 12

- DunnBC22/yolos-small-Axial_MRIs
  Object Detection • Updated • 65 • 2
- DunnBC22/bert-base-cased-finetuned-ner-BC2GM-IOB
  Token Classification • Updated • 37 • 1
- DunnBC22/efficientnet-b5-Brain_Tumors_Image_Classification
  Image Classification • Updated • 31
- DunnBC22/vit-base-patch16-224-in21k_lung_and_colon_cancer
  Image Classification • Updated • 2.48k • 4

- DunnBC22/pegasus-multi_news-NewsSummarization_BBC
  Summarization • Updated • 683 • 2
- DunnBC22/flan-t5-base-text_summarization_data_6_epochs
  Summarization • Updated • 58 • 2
- DunnBC22/flan-t5-base-text_summarization_data
  Summarization • Updated • 52 • 2
- DunnBC22/led-base-16384-text_summarization_data
  Summarization • Updated • 38 • 1

- LLaSM: Large Language and Speech Model
  Paper • 2308.15930 • Published • 33
- SpeechX: Neural Codec Language Model as a Versatile Speech Transformer
  Paper • 2308.06873 • Published • 26
- AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining
  Paper • 2308.05734 • Published • 37
- JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models
  Paper • 2308.04729 • Published • 32

- VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation
  Paper • 2309.00398 • Published • 22
- AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
  Paper • 2307.04725 • Published • 64
- LEDITS: Real Image Editing with DDPM Inversion and Semantic Guidance
  Paper • 2307.00522 • Published • 32
- VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning
  Paper • 2309.15091 • Published • 33