Question Answering
Transformers
English
Chinese
multimodal
vqa
text
audio
Eval Results
Inference Endpoints
File size: 22 Bytes
5a2ea64
trust_remote_code=True