Introduction
This model is based on dunzhang/stella_en_1.5B_v5 and google/siglip-so400m-patch14-384.
It can encode both text and images.
Report: https://arxiv.org/abs/2412.19048
Code: https://github.com/NLPJCL/RAG-Retrieval
Data: https://huggingface.co/datasets/infgrad/jasper_text_distill_dataset
Training logs: https://api.wandb.ai/links/dunnzhang0/z8jqoqpb
The core idea of Jasper and Stella is distillation: the student model learns to reproduce the teacher model's vectors.
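As a rough illustration of what "learning the teacher's vectors" can mean, the sketch below computes a cosine-based distillation loss between paired student and teacher vectors. This is a minimal, dependency-free sketch and not the training code from the report; the function names `cosine_similarity` and `cosine_distillation_loss` and the toy vectors are assumptions made for illustration, and the actual training objective may combine several loss terms.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_distillation_loss(student_vecs, teacher_vecs):
    """Mean of (1 - cosine similarity) over paired student/teacher vectors.

    The loss is 0 when every student vector points in the same direction
    as its teacher vector, and grows as the directions diverge.
    """
    losses = [
        1.0 - cosine_similarity(s, t)
        for s, t in zip(student_vecs, teacher_vecs)
    ]
    return sum(losses) / len(losses)


# Toy data (hypothetical): two teacher vectors and two candidate students.
teacher = [[1.0, 0.0], [0.0, 2.0]]
aligned_student = [[2.0, 0.0], [0.0, 1.0]]     # same directions as the teacher
misaligned_student = [[0.0, 1.0], [1.0, 0.0]]  # orthogonal to the teacher

print(cosine_distillation_loss(aligned_student, teacher))     # 0.0
print(cosine_distillation_loss(misaligned_student, teacher))  # 1.0
```

Because the loss depends only on direction, a student can match the teacher without matching vector norms, which fits retrieval setups that score by cosine similarity.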
Usage
import torch
from sentence_transformers import SentenceTransformer
DOC1 = """
Blue light is scattered in all directions by the tiny molecules of air in Earth's atmosphere.
Blue is scattered more than other colors because it travels as shorter, smaller waves. This is why we see a blue sky most of the time.
Closer to the horizon, the sky fades to a lighter blue or white.
"""
DOC2 = """
When choosing colors, you can consider the following factors:
Color theory: Understand how colors work together and how they can evoke different reactions.
Color psychology: Consider how colors affect emotions, behaviors, and responses.
Brand identity: Colors can convey meaning and information about a brand.
Mood: Consider the mood you want to create. For example, brighter colors can feel cheerful, while cooler colors can be calming.
Space: Consider the size of the space and the amount of natural light it receives. Dark colors can make a room feel smaller, while light colors can make it feel larger.
Color wheel: Use the color wheel to identify primary, secondary, and tertiary colors.
Color combinations: Decide how to best complement your preferred color with others.
Color palette: Limit your color palette to a main color and one or two additional colors.
60-30-10 rule: Use a primary color 60% of the time, a secondary color 30% of the time, and an accent color 10% of the time
"""
if __name__ == "__main__":
    # load model
    use_gpu = False
    model_name = "infgrad/jasper_en_vision_language_v1"
    model = SentenceTransformer(
        model_name,
        trust_remote_code=True,
        device="cpu" if not use_gpu else "cuda",
        model_kwargs={
            "torch_dtype": torch.bfloat16 if use_gpu else torch.float32,
            "attn_implementation": "sdpa",
        },
        # vector_dim must be one of 12288, 1024, 512, 256 (1024 is recommended)
        # set is_text_encoder to True if you do not encode images
        config_kwargs={"is_text_encoder": False, "vector_dim": 1024},
    )
    # We can reduce the max_seq_length from the default of 2048 for faster encoding
    model.max_seq_length = 1024

    # data: each document is either a plain string or a list of typed parts
    q_list = [
        "Why the sky is blue?",
        "how to choose suitable color",
    ]
    doc_list = [
        DOC1,
        [{"type": "image_path", "content": "./assets/img1.png"}, {"type": "text", "content": "Hope this image helps!"}],
        DOC2,
        [{"type": "image_path", "content": "./assets/img2.png"}],
    ]
    # queries use the "s2p_query" prompt; documents are encoded without a prompt
    q_vecs = model.encode(q_list, prompt_name="s2p_query")
    doc_vecs = model.encode(doc_list)

    # calculate similarity (rows are queries, columns are documents)
    similarities = model.similarity(q_vecs, doc_vecs)
    print(similarities)
    # the output is:
    # tensor([[0.7775, 0.7594, 0.2429, 0.2187],
    #         [0.3226, 0.3054, 0.7421, 0.5484]])
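
To turn the similarity matrix into a ranking, one option is to sort each query's row by score. The sketch below uses the scores printed above as plain Python lists; `rank_documents` and `top_k` are hypothetical names introduced here for illustration. With a real tensor you would likely convert it first, for example with `similarities.tolist()`.

```python
# Scores copied from the example output above:
# rows are queries, columns are documents (DOC1, img1 + text, DOC2, img2).
similarities = [
    [0.7775, 0.7594, 0.2429, 0.2187],
    [0.3226, 0.3054, 0.7421, 0.5484],
]


def rank_documents(scores, top_k=2):
    """Return the top_k (doc_index, score) pairs, highest score first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


for query_index, row in enumerate(similarities):
    print(query_index, rank_documents(row))
# 0 [(0, 0.7775), (1, 0.7594)]
# 1 [(2, 0.7421), (3, 0.5484)]
```

In this example the sky question ranks DOC1 and the first image highest, while the color question ranks DOC2 and the second image highest.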
Evaluation on MTEB
Evaluation script: ./scripts/evaluate_en_mteb/run_evaluate_mteb.py
License
This model should not be used for any commercial purpose!
Citation
@misc{zhang2025jasperstelladistillationsota,
title={Jasper and Stella: distillation of SOTA embedding models},
author={Dun Zhang and Jiacheng Li and Ziyang Zeng and Fulong Wang},
year={2025},
eprint={2412.19048},
archivePrefix={arXiv},
primaryClass={cs.IR},
url={https://arxiv.org/abs/2412.19048},
}
Evaluation results
- accuracy on MTEB AmazonCounterfactualClassification (en-ext), test set (self-reported): 95.727
- f1 on MTEB AmazonCounterfactualClassification (en-ext), test set (self-reported): 89.255
- f1_weighted on MTEB AmazonCounterfactualClassification (en-ext), test set (self-reported): 95.856
- ap on MTEB AmazonCounterfactualClassification (en-ext), test set (self-reported): 67.156
- ap_weighted on MTEB AmazonCounterfactualClassification (en-ext), test set (self-reported): 67.156
- main_score on MTEB AmazonCounterfactualClassification (en-ext), test set (self-reported): 95.727
- accuracy on MTEB AmazonCounterfactualClassification (en), test set (self-reported): 93.776
- f1 on MTEB AmazonCounterfactualClassification (en), test set (self-reported): 90.758
- f1_weighted on MTEB AmazonCounterfactualClassification (en), test set (self-reported): 93.974
- ap on MTEB AmazonCounterfactualClassification (en), test set (self-reported): 74.888