HuggingFaceTB/SmolVLM-256M-Instruct
Nice! Looking forward to seeing your work!
Hi, nice work! Do you think it's possible to replace the TTS part of the current end-to-end model (https://huggingface.co/openbmb/MiniCPM-o-2_6) with Kokoro? I've heard its speed and size are a great fit for edge devices.