Different results between Jax Space and the HF Transformers Space

#2
by Shalev - opened

From https://huggingface.co/spaces/big-vision/paligemma - the Jax model works well.

But the https://huggingface.co/spaces/big-vision/paligemma-hf space just selects the entire image (on the same input). I'm trying to reproduce the (better) Jax behavior on HF transformers, but I can't figure out what's being done differently on the Jax side. Any tips would be appreciated!

Seeing similar issues, is there a difference in the HF version?

@Shalev @codelion I will debug and come back to you on this

Hi, how can we decode the segmentation tokens into a binary mask for object segmentation?
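A minimal sketch of the first step, assuming the model's text output uses four `<locNNNN>` tokens (y_min, x_min, y_max, x_max, each in 0..1023) followed by sixteen `<segNNN>` tokens (codebook indices in 0..127) and then a label. The function name `parse_segment_output` and the returned dictionary layout are hypothetical, chosen for illustration. This only recovers the box and the codebook indices. As far as I know, turning those sixteen indices into an actual mask additionally needs the VQ-VAE decoder used by the big_vision demo, which is not reproduced here.

```python
import re

# Assumed format per object: 4 location tokens, optionally 16 segmentation
# tokens, then a free-text label. Objects are assumed to be separated by ";".
_OBJECT_RE = re.compile(
    r"((?:<loc\d{4}>){4})\s*((?:<seg\d{3}>){16})?\s*([^;<]*)"
)


def parse_segment_output(text, width, height):
    """Extract pixel boxes and segmentation codebook indices from model text.

    Hypothetical helper: returns one dict per detected object with
    "box" as (x_min, y_min, x_max, y_max) in pixels, "seg_indices" as a
    list of 16 ints (empty if no <seg> tokens were emitted), and "label".
    """
    objects = []
    for match in _OBJECT_RE.finditer(text):
        # Location tokens are normalized to a 1024-step grid, in y, x, y, x order.
        locs = [int(v) for v in re.findall(r"<loc(\d{4})>", match.group(1))]
        y_min, x_min, y_max, x_max = [v / 1024 for v in locs]
        box = (
            round(x_min * width),
            round(y_min * height),
            round(x_max * width),
            round(y_max * height),
        )
        seg_indices = [
            int(v) for v in re.findall(r"<seg(\d{3})>", match.group(2) or "")
        ]
        objects.append(
            {"box": box, "seg_indices": seg_indices, "label": match.group(3).strip()}
        )
    return objects
```

Comparing the parsed boxes and indices from the JAX Space and the HF Space on the same image and prompt is one way to check whether the two already disagree at the token level or only after mask decoding.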

@codelion Thank you

@merve Did you find out why the HF version does not perform as well? I am having the same issue as @Shalev, but with segmentation. The HF version returns a mask of zeros, while the Jax version works pretty well.
