GerColBERT
This is a PyLate model fine-tuned from deepset/gbert-base on the samheym/ger-dpr-collection dataset. It maps sentences and paragraphs to sequences of 128-dimensional dense vectors and can be used for semantic textual similarity using the MaxSim operator.
Model Details
Model Description
- Model Type: PyLate model
- Base model: deepset/gbert-base
- Document Length: 180 tokens
- Query Length: 32 tokens
- Output Dimensionality: 128 dimensions per token
- Similarity Function: MaxSim
- Training Dataset: samheym/ger-dpr-collection
- Language: de
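To make the MaxSim similarity function concrete, here is a minimal pure-Python sketch. It is not PyLate's implementation: the function name `maxsim` and the toy 2-dimensional, unnormalized vectors are illustrative assumptions (the model itself produces 128-dimensional vectors per token). For each query token, the score takes the highest dot product against any document token, then sums these maxima over the query tokens.

```python
def maxsim(query_embeddings, document_embeddings):
    """Late-interaction score: sum over query tokens of the best
    dot-product match among the document tokens.

    Both arguments are lists of token vectors (lists of floats).
    `maxsim` is a hypothetical helper for illustration only.
    """
    score = 0.0
    for q in query_embeddings:
        # Best-matching document token for this query token.
        score += max(
            sum(a * b for a, b in zip(q, d)) for d in document_embeddings
        )
    return score


# Toy token embeddings (2 dimensions instead of the model's 128).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # matches both query tokens exactly
doc_b = [[0.5, 0.5]]                          # one token, partial match to each

print(maxsim(query, doc_a))  # 2.0
print(maxsim(query, doc_b))  # 1.0
```

Because each query token is matched independently, a document scores highly only when it contains a good match for every query token, which is why `doc_a` outranks `doc_b` here.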
Usage
First install the PyLate library:
pip install -U pylate
Retrieval
PyLate provides an interface for indexing and retrieving documents with ColBERT models. Its index is built on the Voyager HNSW index, which stores the document embeddings and supports fast approximate nearest-neighbor retrieval.
from pylate import indexes, models, retrieve

# Step 1: Load the ColBERT model
model = models.ColBERT(
    model_name_or_path="samheym/GerColBERT",
)
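The snippet above stops after loading the model. The remaining indexing and retrieval steps might look like the sketch below, which follows the usual PyLate workflow. The helper name `index_and_retrieve`, the index folder and name, and the batch size are assumptions chosen for illustration, and the calls should be checked against the installed PyLate version.

```python
def index_and_retrieve(model, documents_ids, documents, queries, k=3):
    """Index `documents` with a Voyager index and retrieve the top-k
    matches for each query. Hypothetical helper, not part of PyLate.
    """
    # Imported here so the sketch can be loaded without PyLate installed.
    from pylate import indexes, retrieve

    # Step 2: Create the Voyager index (folder and name are assumptions).
    index = indexes.Voyager(
        index_folder="pylate-index",
        index_name="index",
        override=True,  # overwrite an existing index with the same name
    )

    # Step 3: Encode the documents (is_query=False selects document mode).
    documents_embeddings = model.encode(
        documents,
        batch_size=32,
        is_query=False,
        show_progress_bar=False,
    )

    # Step 4: Add the document embeddings to the index under their ids.
    index.add_documents(
        documents_ids=documents_ids,
        documents_embeddings=documents_embeddings,
    )

    # Step 5: Encode the queries and retrieve the top-k documents.
    retriever = retrieve.ColBERT(index=index)
    queries_embeddings = model.encode(
        queries,
        batch_size=32,
        is_query=True,
        show_progress_bar=False,
    )
    return retriever.retrieve(queries_embeddings=queries_embeddings, k=k)
```

With the model loaded in Step 1, a call such as `index_and_retrieve(model, ["1", "2"], ["Berlin ist die Hauptstadt von Deutschland.", "Der Rhein fließt durch Köln."], ["Was ist die Hauptstadt von Deutschland?"], k=2)` is expected to return, for each query, a ranked list of matches with document ids and scores.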
Training Details
Framework Versions
- Python: 3.12.3
- Sentence Transformers: 3.4.1
- PyLate: 1.1.4
- Transformers: 4.48.2
- PyTorch: 2.6.0+cu124
- Accelerate: 1.4.0
- Datasets: 2.21.0
- Tokenizers: 0.21.0