Cross Encoding
    Model: MiniLM
    Lang: IT

Model description

This is a MiniLMv2 [1] model for the Italian language, obtained by taking mmarco-mMiniLMv2-L12-H384-v1 as a starting point and adapting it to Italian by modifying the embedding layer (as in [2], computing document-level token frequencies over the Wikipedia dataset).
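
Below is a minimal sketch of the vocabulary-reduction idea described above, not the exact procedure used for this model: count, for each subword token, the number of documents it appears in, keep the most frequent entries, and slice the embedding matrix of the multilingual starting checkpoint accordingly. The tiny corpus and the 30k cutoff are illustrative placeholders, and the base repo path is assumed from the model name given in this card.

```python
from collections import Counter

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Repo path assumed from the model name in this card
base_id = "cross-encoder/mmarco-mMiniLMv2-L12-H384-v1"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForSequenceClassification.from_pretrained(base_id)

# Stand-in corpus: in practice, iterate over the full Italian Wikipedia dump
corpus = [
    "Roma è la capitale d'Italia.",
    "La Divina Commedia fu scritta da Dante Alighieri.",
]

# Document-level frequency: each token is counted at most once per document
doc_freq = Counter()
for doc in corpus:
    doc_freq.update(set(tokenizer.tokenize(doc)))

# Keep the special tokens plus the most frequent subwords (cutoff is illustrative)
keep = list(dict.fromkeys(
    tokenizer.all_special_tokens + [t for t, _ in doc_freq.most_common(30_000)]
))
keep_ids = [tokenizer.convert_tokens_to_ids(t) for t in keep]

# Slice the multilingual embedding matrix down to the reduced Italian vocabulary
# (a real pipeline would also rebuild the tokenizer and update the model config)
old_emb = model.get_input_embeddings().weight.data
new_emb = torch.nn.Embedding(len(keep_ids), old_emb.size(1))
new_emb.weight.data = old_emb[keep_ids].clone()
model.set_input_embeddings(new_emb)
```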

The resulting model has 33M parameters, a vocabulary of 30,498 tokens, and a size of ~130 MB.
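
As a usage sketch, the model can be loaded as a cross-encoder with the sentence-transformers library and used to score query–passage relevance; the example below assumes the repository ID of this card and Italian example sentences chosen for illustration.

```python
from sentence_transformers import CrossEncoder

# Repository ID of this model card
model = CrossEncoder("osiria/minilm-l12-h384-italian-cross-encoder", max_length=512)

query = "Qual è la capitale d'Italia?"
passages = [
    "Roma è la capitale della Repubblica Italiana.",
    "Il Monte Bianco è la montagna più alta delle Alpi.",
]

# One score per (query, passage) pair; higher means more relevant
scores = model.predict([(query, p) for p in passages])
for passage, score in zip(passages, scores):
    print(f"{score:.3f}  {passage}")
```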

References

[1] https://arxiv.org/abs/2012.15828

[2] https://arxiv.org/abs/2010.05609

License

The model is released under the MIT license.
