V100 GPU Support?

by deleted - opened Apr 23, 2024

Discussion

deleted

Apr 23, 2024

edited Apr 23, 2024

Could you please support inference on V100 GPUs? Thank you very much.

czczup

OpenGVLab org Apr 25, 2024

Hello, thanks for your interest. We are preparing quantized models that can run on a 32 GB V100.

deleted

Apr 26, 2024

Thank you very much for your reply. Looking forward to further developments!

z-hb

Apr 28, 2024

Looking forward to further developments!

whai362

Apr 28, 2024

https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5-Int8

z-hb

Apr 28, 2024

https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5-Int8
Thanks. The key problem on the V100 is FlashAttention. I believe quantized models cannot solve it unless the attention code is changed.
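
To illustrate the point about the attention code, here is a minimal sketch of a capability check that falls back to a standard attention path on older GPUs. It assumes that FlashAttention-2 kernels require compute capability 8.0 (Ampere) or newer, while the V100 reports 7.0. The function names and the returned backend labels are hypothetical and are not part of the InternVL code.

```python
from typing import Optional, Tuple

# Assumption: FlashAttention-2 kernels target compute capability 8.0 (Ampere) or newer.
# A V100 reports (7, 0), so it falls below this threshold.
FLASH_MIN_CAPABILITY: Tuple[int, int] = (8, 0)


def supports_flash_attention(capability: Tuple[int, int]) -> bool:
    """Return True if the (major, minor) compute capability meets the assumed threshold."""
    return tuple(capability) >= FLASH_MIN_CAPABILITY


def pick_attention_backend(capability: Tuple[int, int]) -> str:
    """Choose a hypothetical backend label for the given GPU capability."""
    if supports_flash_attention(capability):
        return "flash_attention"
    return "standard_attention"


def detect_capability() -> Optional[Tuple[int, int]]:
    """Query the first CUDA device via PyTorch; return None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    capability = detect_capability()
    if capability is None:
        print("No CUDA device detected")
    else:
        print(capability, pick_attention_backend(capability))
```

Under these assumptions, `pick_attention_backend((7, 0))` selects the standard path for a V100, which is why quantization alone would not help: the model code itself has to offer a non-flash attention branch.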

deleted

May 14, 2024

https://github.com/OpenGVLab/InternVL/issues/144

deleted changed discussion status to closed May 14, 2024
