license: mit
# akhooli/arabic-colbertv2-711k-norm
This is a ColBERT V2 model trained on the Arabic mMARCO dataset, with queries containing Latin words removed (711K queries).
It is not fully trained, but it is good for many tasks, especially ranking.
The dataset was normalized before training, so please normalize your queries and documents the same way before using the model.
```python
from unicodedata import normalize

query = "ما هي عاصمة مصر؟"  # example query; replace with your own
query_n = normalize('NFKC', query)  # NFKC-normalize before passing to the model
```
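Documents need the same treatment as queries. The following is a minimal sketch, assuming NFKC is the only normalization step required, as the snippet above suggests. The helper name `normalize_text` and the sample strings are illustrative and not part of the model's API.

```python
from unicodedata import normalize


def normalize_text(text: str) -> str:
    """Apply NFKC normalization (hypothetical helper, assumed to match training)."""
    return normalize("NFKC", text)


# Illustrative inputs; the second document contains the presentation-form
# ligature U+FEFB, which NFKC maps to the plain letters lam + alef.
query = "ما هي عاصمة مصر؟"
docs = ["القاهرة هي عاصمة مصر.", "\ufefb"]

query_n = normalize_text(query)
docs_n = [normalize_text(d) for d in docs]
```

Applying the same function to both sides keeps the query and document text in the same Unicode form the model saw during training.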