Embedding Model Datasets Collection A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 67 items • Updated Jul 3, 2024 • 92
FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models Paper • 2402.10986 • Published Feb 16, 2024 • 77
LayoutLM Collection The LayoutLM series are Transformer encoders useful for document AI tasks such as invoice parsing, document image classification and DocVQA. • 5 items • Updated 5 days ago • 14
SpeechT5 Collection The SpeechT5 framework consists of a shared seq2seq and six modal-specific (speech/text) pre/post-nets that can address a few audio-related tasks. • 8 items • Updated 5 days ago • 23