Running into "Flash attention implementation does not support kwargs: prompt_length" when using the exact example from the Readme

#49
by HotSauce7 - opened

Hi folks,
thanks for the amazing work. Unfortunately, when I use the exact example from your README (model card), which is:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jinaai/jina-embeddings-v3", trust_remote_code=True)

task = "retrieval.query"
embeddings = model.encode(
    ["What is the weather like in Berlin today?"],
    task=task,
    prompt_name=task,
)

With both sentence-transformers==3.2.0 and sentence-transformers==3.1.0 I get the warning "Flash attention implementation does not support kwargs: prompt_length". If I remove prompt_name=task the warning goes away, but the resulting embeddings are completely different.
I have flash-attn==2.6.2 installed.
Any ideas what I am missing?
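
For context on why the embeddings differ: model.prompts is the standard sentence-transformers attribute that maps prompt names to the prefixes encode() prepends, so you can inspect what prompt_name actually adds. A quick sketch (the exact strings come from the model's own configuration):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jinaai/jina-embeddings-v3", trust_remote_code=True)

# encode() prepends the prompt text registered under prompt_name to each
# input, so dropping prompt_name=task embeds a different string. That is
# why the embeddings change, independent of the warning.
print(model.prompts)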

Have you found a solution for this?

I am experiencing the same issue. Any advice would be greatly appreciated.

# https://huggingface.co/jinaai/xlm-roberta-flash-implementation/blob/main/modeling_xlm_roberta.py
# line 671
adapter_mask = kwargs.pop("adapter_mask", None)
if kwargs:
    for key, value in kwargs.items():
        if value is not None:
            logger.warning(
                "Flash attention implementation does not support kwargs: %s",
                key,
            )

The file can be found at ~/.cache/huggingface/modules/transformers_modules/jinaai/xlm-roberta-flash-implementation/12700ba4972d9e900313a85ae855f5a76fb9500e

Maybe we could decrease the logger level to debug to suppress the warning.
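
In the meantime, a user-side workaround is to raise the threshold of the logger that emits the warning. A sketch; the exact logger name is an assumption (remote code is loaded under the transformers_modules package), so it is matched here by substring:

import logging

# Sketch of a user-side workaround: silence the warning by raising the level
# of the module's logger. The logger name is an assumption, so we scan the
# registered loggers and match on the module name instead of hardcoding it.
for name in list(logging.root.manager.loggerDict):
    if "xlm_roberta" in name:
        logging.getLogger(name).setLevel(logging.ERROR)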

Hi, thanks for reporting the issue. The warning didn't actually affect the model outputs, but I've made a change to stop passing prompt_length to the model, so you shouldn't see it anymore.

As for the argument itself, it seems some models need prompt_length so they can exclude the prompt tokens during pooling; jina-embeddings-v3 doesn't do this, so it's not relevant for us.
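
To illustrate what prompt_length is for in models that do use it, here is a minimal sketch (not jina-embeddings-v3's implementation) of mean pooling that leaves the instruction prefix out of the average:

import torch

# Minimal sketch of prompt-excluding mean pooling: zero out the first
# prompt_length positions in the attention mask so the instruction prefix
# does not influence the final embedding.
def mean_pool_excluding_prompt(token_embeddings, attention_mask, prompt_length=0):
    # token_embeddings: (batch, seq_len, dim); attention_mask: (batch, seq_len)
    mask = attention_mask.clone()
    mask[:, :prompt_length] = 0  # drop prompt positions from the average
    mask = mask.unsqueeze(-1).to(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return summed / counts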

@jupyterjazz Thank you for the quick response. It was mostly a matter of confusion about whether using prompt_name=task was correct. The clarification and the fix make perfect sense. Closing the issue!

HotSauce7 changed discussion status to closed
