Support added to openedai-vision

by matatonic - opened Oct 10, 2024

Oct 10, 2024

Great model! It's really smart and gets fine details well. Way more than just captions now.

Just FYI, and I hope it was ok (I took some liberties with your spaces code), I added support for llama-joycaption-alpha-two (from the demo space) to openedai-vision and also added multi-image support. See: https://github.com/matatonic/openedai-vision/blob/main/backend/joy-caption-latest.py

I'll add this repo once it's done. Thanks again!

fancyfeast

Owner Oct 12, 2024

Very cool, thank you.

I hope it was ok

Of course, the model is free for you and everyone to use!

And I recommend checking out the example code on the github now: https://github.com/fpgaminer/joycaption?tab=readme-ov-file#example-usage
I've simplified the usage now that the model is packaged into HF's Llava class. (No Processor yet, but at least this eliminates the custom ImageAdapter class and such).

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment