vllm <SpeechHere> token id
#3, opened by jitanus
It seems that in prompt_token_ids the index for "<SpeechHere>" is 256000, and we get this error:
AssertionError: The text input contains 0 audio tokens, but 1 audios provided
This happens because speech_token_index is expected to be 255999.
Hi, can you provide your inference code?
Please make sure the tokenizer_mode argument is set to slow. For now, we simply modified Gemma's tokenizer_config.json and changed its token 255999 from <unused99> to <SpeechHere>, but the Hugging Face fast tokenizer can't recognize it.
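For reference, here is a minimal sketch of the setup described in this reply. It assumes vLLM's `LLM` constructor accepts `tokenizer_mode="slow"` in your installed version. `count_speech_tokens` and `build_llm` are hypothetical helper names used only for illustration, and `model_path` is a placeholder for wherever your checkpoint lives.

```python
# Token id that <SpeechHere> is expected to map to, per this thread.
SPEECH_TOKEN_INDEX = 255999


def count_speech_tokens(prompt_token_ids, speech_token_index=SPEECH_TOKEN_INDEX):
    """Count how many ids in the prompt equal the expected speech token id."""
    return sum(1 for token_id in prompt_token_ids if token_id == speech_token_index)


def build_llm(model_path):
    """Create a vLLM engine that uses the slow (Python) tokenizer.

    Hypothetical helper: model_path is a placeholder, not a real checkpoint name.
    """
    # Imported lazily so count_speech_tokens works even without vLLM installed.
    from vllm import LLM

    return LLM(model=model_path, tokenizer_mode="slow")
```

A quick sanity check before running inference is to call `count_speech_tokens` on your prompt_token_ids. If it returns 0 while you pass one audio, the tokenizer mapped <SpeechHere> to a different id (256000 in the report above), which is the condition behind the AssertionError.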
Thanks for your prompt reply. That was it! I missed setting tokenizer_mode to "slow".
no problem :)
jitanus changed discussion status to closed