Post
977
Haystack can now see ๐
The latest release of the Haystack OSS LLM framework adds a long-requested feature: image support!
๐ Notebooks below
This isn't just about passing images to an LLM. We built several features to enable practical multimodal use cases.
What's new?
๐ง Support for multiple LLM providers: OpenAI, Amazon Bedrock, Google Gemini, Mistral, NVIDIA, OpenRouter, Ollama and more (support for Hugging Face API coming ๐)
๐๏ธ Prompt template language to handle structured inputs, including images
๐ PDF and image converters
๐ Image embedders using CLIP-like models
๐งพ LLM-based extractor to pull text from images
๐งฉ Components to build multimodal RAG pipelines and Agents
I had the chance of leading this effort with @sjrhuschlee (great collab).
๐ Below you can find two notebooks to explore the new features:
๓ ฏโข๓ ๓ Introduction to Multimodal Text Generation https://haystack.deepset.ai/cookbook/multimodal_intro
๓ ฏโข๓ ๓ Creating Vision+Text RAG Pipelines https://haystack.deepset.ai/tutorials/46_multimodal_rag
(๐ผ๏ธ image by @bilgeyucel )
The latest release of the Haystack OSS LLM framework adds a long-requested feature: image support!
๐ Notebooks below
This isn't just about passing images to an LLM. We built several features to enable practical multimodal use cases.
What's new?
๐ง Support for multiple LLM providers: OpenAI, Amazon Bedrock, Google Gemini, Mistral, NVIDIA, OpenRouter, Ollama and more (support for Hugging Face API coming ๐)
๐๏ธ Prompt template language to handle structured inputs, including images
๐ PDF and image converters
๐ Image embedders using CLIP-like models
๐งพ LLM-based extractor to pull text from images
๐งฉ Components to build multimodal RAG pipelines and Agents
I had the chance of leading this effort with @sjrhuschlee (great collab).
๐ Below you can find two notebooks to explore the new features:
๓ ฏโข๓ ๓ Introduction to Multimodal Text Generation https://haystack.deepset.ai/cookbook/multimodal_intro
๓ ฏโข๓ ๓ Creating Vision+Text RAG Pipelines https://haystack.deepset.ai/tutorials/46_multimodal_rag
(๐ผ๏ธ image by @bilgeyucel )