---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- transformers
- Qwen2
license: other
license_name: qodoai-open-rail-m
license_link: LICENSE
pipeline_tag: sentence-similarity
library_name: sentence-transformers
base_model: Alibaba-NLP/gte-Qwen2-1.5B-instruct
---

## Qodo-Embed-1

**Qodo-Embed-1** is a state-of-the-art code embedding model designed for retrieval tasks in the software development domain. It is offered in two sizes: lite (1.5B) and medium (7B). The model is optimized for natural language-to-code and code-to-code retrieval, making it highly effective for applications such as code search, retrieval-augmented generation (RAG), and contextual understanding of programming languages.

This model outperforms all previous open-source models on the CoIR and MTEB leaderboards, achieving best-in-class performance while being significantly smaller than competing models.

### Languages Supported:
* Python
* C++
* C#
* Go
* Java
* JavaScript
* PHP
* Ruby
* TypeScript

## Model Information

- Model Size: 1.5B
- Embedding Dimension: 1536
- Max Input Tokens: 32k

## Requirements

```
transformers>=4.39.2
flash_attn>=2.5.6
```

## Usage

### Sentence Transformers

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Qodo/Qodo-Embed-1-1.5B")

# Run inference
sentences = [
    'accumulator = sum(item.value for item in collection)',
    'result = reduce(lambda acc, curr: acc + curr.amount, data, 0)',
    'matrix = [[i*j for j in range(n)] for i in range(n)]'
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1536]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

### Transformers

```python
import torch
import torch.nn.functional as F
from torch import Tensor
from transformers import AutoTokenizer, AutoModel


def last_token_pool(last_hidden_states: Tensor,
                    attention_mask: Tensor) -> Tensor:
    left_padding = (attention_mask[:, -1].sum() == attention_mask.shape[0])
    if left_padding:
        return last_hidden_states[:, -1]
    else:
        sequence_lengths = attention_mask.sum(dim=1) - 1
        batch_size = last_hidden_states.shape[0]
        return last_hidden_states[torch.arange(batch_size, device=last_hidden_states.device), sequence_lengths]


# Example natural-language queries and candidate code documents
queries = [
    'how to handle memory efficient data streaming',
    'implement binary tree traversal'
]
documents = [
    """def process_in_chunks():
    buffer = deque(maxlen=1000)
    for record in source_iterator:
        buffer.append(transform(record))
        if len(buffer) >= 1000:
            yield from buffer
            buffer.clear()""",
    """class LazyLoader:
    def __init__(self, source):
        self.generator = iter(source)
        self._cache = []

    def next_batch(self, size=100):
        while len(self._cache) < size:
            try:
                self._cache.append(next(self.generator))
            except StopIteration:
                break
        return self._cache.pop(0) if self._cache else None""",
    """def dfs_recursive(root):
    if not root:
        return []
    stack = []
    stack.extend(dfs_recursive(root.right))
    stack.append(root.val)
    stack.extend(dfs_recursive(root.left))
    return stack"""
]
input_texts = queries + documents

tokenizer = AutoTokenizer.from_pretrained('Qodo/Qodo-Embed-1-1.5B', trust_remote_code=True)
model = AutoModel.from_pretrained('Qodo/Qodo-Embed-1-1.5B', trust_remote_code=True)

max_length = 8192

# Tokenize the input texts
batch_dict = tokenizer(input_texts, max_length=max_length, padding=True, truncation=True, return_tensors='pt')
outputs = model(**batch_dict)
embeddings = last_token_pool(outputs.last_hidden_state, batch_dict['attention_mask'])

# normalize embeddings
embeddings = F.normalize(embeddings, p=2, dim=1)
scores = (embeddings[:2] @ embeddings[2:].T) * 100
print(scores.tolist())
```
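### Retrieval example

For retrieval use cases such as code search or RAG, a typical pattern is to embed a natural-language query together with candidate code snippets and rank the snippets by similarity. The following is a minimal sketch using the same Sentence Transformers API shown above; the query and snippets are illustrative only, and it assumes queries can be encoded as plain text without a task instruction, as in the examples above.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Qodo/Qodo-Embed-1-1.5B")

# A natural-language query and a few candidate code snippets (illustrative only)
query = "count word frequencies in a text file"
snippets = [
    "from collections import Counter\n"
    "def word_counts(path):\n"
    "    with open(path) as f:\n"
    "        return Counter(w for line in f for w in line.split())",
    "def fibonacci(n):\n"
    "    a, b = 0, 1\n"
    "    for _ in range(n):\n"
    "        a, b = b, a + b\n"
    "    return a",
    "function debounce(fn, ms) {\n"
    "  let t;\n"
    "  return (...args) => {\n"
    "    clearTimeout(t);\n"
    "    t = setTimeout(() => fn(...args), ms);\n"
    "  };\n"
    "}",
]

# Embed the query and the candidates, then score them with cosine similarity
query_embedding = model.encode([query])
snippet_embeddings = model.encode(snippets)
scores = model.similarity(query_embedding, snippet_embeddings)[0]

# Rank snippets from most to least relevant to the query
for score, snippet in sorted(zip(scores.tolist(), snippets), reverse=True):
    print(f"{score:.3f}  {snippet.splitlines()[0]}")
```

In practice you would typically embed the code corpus once, store the vectors in an index, and embed only the query at search time.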
## License

[QodoAI-Open-RAIL-M](https://www.qodo.ai/open-rail-m-license/)