Multimodal
My collection of models that I want to check out.
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • 6B • Updated May 1 • 282k • 1.46k
Document AI
All the papers that can fundamentally help in creating a true open-source processing pipeline.
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model • Paper • 2409.01704 • Published Sep 3, 2024 • 84