Cross-Encoder for MS MARCO - EN-DE (ONNX)

This repository hosts the ONNX export of the cross-lingual EN-DE Cross-Encoder model, originally developed for passage re-ranking. The underlying model was trained on the MS MARCO Passage Ranking dataset.

The ONNX format lets the model run on any platform with an ONNX-compatible runtime, which makes it easier to integrate into existing deployments.

For the original model and further details, refer to cross-encoder/msmarco-MiniLM-L12-en-de-v1 on Hugging Face.

Application

This model can be used for information retrieval tasks. For usage examples, see SBERT.net Retrieve & Re-rank.

Training Code

The training script used for the original model is available in this repository; see train_script.py.

How to Use in ONNX

To load and use the model in ONNX format, make sure an ONNX-compatible runtime is installed in your environment. We, for example, run this model with Metarank.
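
Outside Metarank, the model can in principle also be scored directly with ONNX Runtime. The following is a minimal sketch under several assumptions, not a documented method from this repository: it assumes the `onnxruntime` and `transformers` Python packages are installed, that `model.onnx` is a placeholder for the local path of the exported file, that the tokenizer of the original cross-encoder/msmarco-MiniLM-L12-en-de-v1 model matches this export, and that the graph emits one relevance logit per query-passage pair.

```python
def load_session_and_tokenizer(model_path, tokenizer_id):
    """Load the ONNX model and a tokenizer (both arguments are assumptions)."""
    # Imported lazily so score_pairs() below has no hard dependency on them.
    import onnxruntime as ort
    from transformers import AutoTokenizer

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
    return session, tokenizer


def score_pairs(session, tokenizer, query, passages):
    """Return one relevance score per passage for the given query."""
    passages = list(passages)
    encoded = tokenizer(
        [query] * len(passages),  # the query is repeated for every passage
        passages,
        padding=True,
        truncation=True,
        return_tensors="np",
    )
    # Feed only the inputs the exported graph declares
    # (some exports omit token_type_ids, for example).
    wanted = {inp.name for inp in session.get_inputs()}
    feeds = {name: value for name, value in encoded.items() if name in wanted}
    logits = session.run(None, feeds)[0]  # assumed shape: (batch, 1)
    return [float(row[0]) for row in logits]


# Hypothetical usage (path and tokenizer id are placeholders):
# session, tokenizer = load_session_and_tokenizer(
#     "model.onnx", "cross-encoder/msmarco-MiniLM-L12-en-de-v1")
# scores = score_pairs(session, tokenizer, "Wie hoch ist die Zugspitze?",
#                      ["Die Zugspitze ist 2962 Meter hoch.", "Berlin liegt an der Spree."])
```

Higher scores indicate higher estimated relevance. If the runtime rejects the inputs because of their integer type, casting the arrays to 64-bit integers before calling `session.run` is a likely fix.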

Model Evaluation

Best Re-ranker for German Texts

Based on an internal evaluation conducted by our development team, the cross-encoder/msmarco-MiniLM-L12-en-de-v1 model has been identified as one of the most effective re-rankers for German texts.

Evaluation Context

The primary goal was to improve chatbot response accuracy and the relevance of the documents retrieved for user queries. Re-ranking matters here because it moves the most relevant documents to the top of the search results, so the chatbot bases its responses on the most relevant material.
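
The re-ranking step itself can be sketched in a few lines. This is an illustration only: `rerank` and `overlap_score` are hypothetical names, and the word-overlap scorer is a toy stand-in for the cross-encoder, which in practice would supply the relevance scores.

```python
def rerank(query, documents, score_fn, top_k=None):
    """Order documents by descending relevance score for the query."""
    scored = [(score_fn(query, doc), doc) for doc in documents]
    # Python's sort is stable, so equally scored documents keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    ranked = [doc for _, doc in scored]
    return ranked if top_k is None else ranked[:top_k]


def overlap_score(query, document):
    """Toy stand-in for a cross-encoder: counts shared lowercase words."""
    return len(set(query.lower().split()) & set(document.lower().split()))


candidates = [
    "Das Wetter ist schön",
    "Die Hauptstadt von Frankreich ist Paris",
    "Berlin ist die Hauptstadt von Deutschland",
]
top_two = rerank("Hauptstadt von Deutschland", candidates, overlap_score, top_k=2)
```

Only the top-ranked documents would then be passed on to the chatbot as context.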

Decision and Reasoning

After testing against several other models, including corrius/cross-encoder-mmarco-mMiniLMv2-L12-H384-v1 and multiple mixedbread-ai variants, the cross-encoder/msmarco-MiniLM-L12-en-de-v1 model showed the best performance at re-ranking German-language texts. In benchmark tests involving complex queries, it performed well in both accuracy and computational efficiency.

Advantages

  • Improved Accuracy: Leads to more precise responses from the chatbot, significantly enhancing user experience.
  • Market Relevance: Offers a competitive edge in German-speaking markets by effectively managing complex queries.

Integrating an ONNX Model with Metarank for German Text Re-ranking

Set Up a Local Metarank Instance

To run Metarank locally, you need to create a config.yml file and set up a Docker container. Below are the detailed steps:

Step 1: Create Configuration File

Create a config.yml file in the root of your Metarank project:

inference:
  msmarco:
    type: cross-encoder
    model: UlanYisaev/msmarco-MiniLM-L12-en-de-v1

Step 2: Run Metarank using Docker

cd metarank
docker run --name metarank -p 8080:8080 -v "$(pwd)":/opt/metarank metarank/metarank:latest serve --config /opt/metarank/config.yml
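
Once the container is running, the cross-encoder can be queried over HTTP. The client below is a hedged sketch that uses only the Python standard library. The endpoint path `/inference/cross/msmarco`, the request body shape (`input` as a list of `query`/`text` pairs) and the response field `scores` are assumptions based on our reading of the Metarank inference API and should be checked against the documentation of the Metarank version in use. The name `msmarco` comes from the config.yml above; the function names are illustrative.

```python
import json
import urllib.request


def build_cross_payload(query, texts):
    """Build the assumed request body: one query/text pair per candidate."""
    return {"input": [{"query": query, "text": text} for text in texts]}


def pair_scores(texts, response_body):
    """Attach the returned scores to their texts, best match first."""
    scores = response_body["scores"]  # assumed response field
    if len(scores) != len(texts):
        raise ValueError("score count does not match text count")
    return sorted(zip(texts, scores), key=lambda item: item[1], reverse=True)


def score_with_metarank(query, texts, base_url="http://localhost:8080", model="msmarco"):
    """POST the pairs to the assumed cross-encoder endpoint and rank the texts."""
    request = urllib.request.Request(
        f"{base_url}/inference/cross/{model}",  # assumed endpoint path
        data=json.dumps(build_cross_payload(query, texts)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return pair_scores(texts, json.load(response))


# Hypothetical usage against the local container:
# ranked = score_with_metarank("Wie beantrage ich einen Reisepass?",
#                              ["Reisepässe beantragen Sie im Bürgeramt.",
#                               "Die Bibliothek öffnet um 9 Uhr."])
```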

Step 3: Configure Metarank Docker Image to Include config.yml

To bake config.yml into the image instead of mounting it at run time, adjust the Docker build definition:

// In Metarank's build.sbt: add the custom entrypoint script and config.yml to the image
new Dockerfile {
    add(new File("deploy/metarank_custom.sh"), "/metarank.sh")
    add(new File("config.yml"), "/opt/metarank/config.yml")
    entryPoint("/metarank.sh")
    cmd("--help")
}

# Build the Docker image
sbt docker

deploy/metarank_custom.sh:

#!/bin/bash

set -euxo pipefail
OPTS=${JAVA_OPTS:-"-Xmx1700m -verbose:gc"}

exec /usr/bin/java $OPTS -cp "/app/*" ai.metarank.main.Main serve --config /opt/metarank/config.yml

Step 4: Run and Manage the Docker Container

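# Container names must be unique: if a "metarank" container from Step 2 still exists, remove it first with: docker rm -f metarank
docker run --name metarank -p 8080:8080 metarank/metarank:0.7.8-amd64
# Stop and later restart the same container
docker stop metarank
docker start metarank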