---
language:
- de
tags:
- ColBERT
- PyLate
- sentence-transformers
- sentence-similarity
pipeline_tag: sentence-similarity
library_name: PyLate
datasets:
- samheym/ger-dpr-collection
base_model:
- deepset/gbert-base
---

# GerColBERT

This is a [PyLate](https://github.com/lightonai/pylate) model trained on the samheym/ger-dpr-collection dataset, with [deepset/gbert-base](https://huggingface.co/deepset/gbert-base) as its base model. It maps sentences and paragraphs to sequences of 128-dimensional dense vectors and can be used for semantic textual similarity using the MaxSim operator.

## Model Details

### Model Description

- **Model Type:** PyLate model
- **Base model:** [deepset/gbert-base](https://huggingface.co/deepset/gbert-base)
- **Document Length:** 180 tokens
- **Query Length:** 32 tokens
- **Output Dimensionality:** 128 dimensions
- **Similarity Function:** MaxSim
- **Training Dataset:** samheym/ger-dpr-collection
- **Language:** de
<!-- - **License:** Unknown -->
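
To make the MaxSim similarity function listed above concrete, here is a minimal pure-Python sketch. It assumes cosine similarity between token vectors and uses 2-dimensional toy vectors in place of the model's 128-dimensional embeddings. The helper names `l2_normalize` and `maxsim` are illustrative and not part of PyLate.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim(query_tokens, doc_tokens):
    """Late-interaction score: for each query token, take the highest
    cosine similarity to any document token, then sum over query tokens."""
    q = [l2_normalize(v) for v in query_tokens]
    d = [l2_normalize(v) for v in doc_tokens]
    score = 0.0
    for qv in q:
        score += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return score


# Toy token embeddings (2-dimensional for readability).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 2.0]]  # matches both query tokens exactly
doc_b = [[1.0, 1.0]]              # matches each query token only partially

print(maxsim(query, doc_a))
print(maxsim(query, doc_b))
```

Because the score is a sum over query tokens, a document ranks higher when each query token finds at least one closely matching document token, as `doc_a` does here.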

## Usage

First install the PyLate library:

```bash
pip install -U pylate
```

### Retrieval

PyLate provides a streamlined interface to index and retrieve documents using ColBERT models. The index is built on the Voyager HNSW index, which stores the document embeddings and enables fast retrieval.

```python
from pylate import indexes, models, retrieve

# Step 1: Load the ColBERT model
model = models.ColBERT(
    model_name_or_path="samheym/GerColBERT",  # the model ID must be a string
)
```
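
The snippet above only loads the model. A minimal sketch of the remaining indexing and retrieval steps, following the usual PyLate workflow, might look like the following. The function name `build_index_and_search`, the index folder and index name, and the batch size are placeholders chosen for illustration, and the call signatures are assumed to match PyLate 1.1.x, so check them against your installed version.

```python
def build_index_and_search(documents_ids, documents, queries, k=3):
    """Index `documents` in a Voyager index and retrieve the top-k per query."""
    # Imported lazily so that defining this helper does not require PyLate.
    from pylate import indexes, models, retrieve

    model = models.ColBERT(model_name_or_path="samheym/GerColBERT")

    # Placeholder folder and index names; override=True replaces an existing index.
    index = indexes.Voyager(
        index_folder="pylate-index",
        index_name="index",
        override=True,
    )

    # Documents and queries are encoded differently, so set is_query explicitly.
    documents_embeddings = model.encode(
        documents, batch_size=32, is_query=False, show_progress_bar=False
    )
    index.add_documents(
        documents_ids=documents_ids,
        documents_embeddings=documents_embeddings,
    )

    retriever = retrieve.ColBERT(index=index)
    queries_embeddings = model.encode(
        queries, batch_size=32, is_query=True, show_progress_bar=False
    )
    return retriever.retrieve(queries_embeddings=queries_embeddings, k=k)


# Example call (downloads the model on first use):
# results = build_index_and_search(
#     documents_ids=["1", "2"],
#     documents=["Berlin ist die Hauptstadt von Deutschland.",
#                "Der Rhein fließt durch Köln."],
#     queries=["Was ist die Hauptstadt von Deutschland?"],
#     k=2,
# )
```

The result is expected to hold one ranked list of matches per query. The index only needs to be built once, after which it can be reused for later queries.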

## Training Details

### Framework Versions

- Python: 3.12.3
- Sentence Transformers: 3.4.1
- PyLate: 1.1.4
- Transformers: 4.48.2
- PyTorch: 2.6.0+cu124
- Accelerate: 1.4.0
- Datasets: 2.21.0
- Tokenizers: 0.21.0