Patching HF bug that creates wrong cache length if only inputs_embeds is passed to the model

#19

by tomer-nv - opened Oct 13, 2024

NVIDIA org Oct 13, 2024

No description provided.

itlevy changed pull request status to merged Oct 13, 2024
