Cross Encoding
Model: MiniLM
Lang: IT

Model description

This is a MiniLMv2 [1] cross-encoder for the Italian language. It was obtained by taking mmarco-mMiniLMv2-L6-H384-v1 as a starting point and specializing it for Italian by modifying the embedding layer, as in [2], computing document-level token frequencies over the Wikipedia dataset.
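
The snippet below is only a rough sketch of the embedding-reduction idea described in [2], not the actual pipeline used for this model: it assumes the starting checkpoint is available as cross-encoder/mmarco-mMiniLMv2-L6-H384-v1 on Hugging Face, uses a placeholder Italian corpus, and omits rebuilding the tokenizer so that token ids match the sliced embedding matrix.

```python
from collections import Counter

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Multilingual starting point (repo id assumed; see model description above).
BASE = "cross-encoder/mmarco-mMiniLMv2-L6-H384-v1"
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForSequenceClassification.from_pretrained(BASE)

# Italian corpus, e.g. Wikipedia articles (placeholder examples).
italian_docs = [
    "Roma è la capitale della Repubblica Italiana.",
    "La Divina Commedia è un poema di Dante Alighieri.",
]

# Document-level frequency: count each token id at most once per document.
doc_freq = Counter()
for doc in italian_docs:
    doc_freq.update(set(tokenizer(doc)["input_ids"]))

# Keep the special tokens plus the most frequent token ids for Italian.
keep = sorted(set(tokenizer.all_special_ids) | {i for i, _ in doc_freq.most_common(30_000)})

# Slice the embedding matrix down to the retained rows.
old_emb = model.get_input_embeddings().weight.data
new_emb = torch.nn.Embedding(len(keep), old_emb.shape[1])
new_emb.weight.data.copy_(old_emb[keep])
model.set_input_embeddings(new_emb)
# The tokenizer's vocabulary would also have to be rebuilt so that its ids
# map onto the new row indices; that step is omitted in this sketch.
```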

The resulting model has 23M parameters, a vocabulary of 30,498 tokens, and a size of ~90 MB.
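
A minimal usage sketch, assuming the sentence-transformers CrossEncoder interface (the query and passages are made-up examples):

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("osiria/minilm-l6-h384-italian-cross-encoder")

# Score (query, passage) pairs: a higher score means a more relevant passage.
scores = model.predict([
    ("Qual è la capitale d'Italia?", "Roma è la capitale della Repubblica Italiana."),
    ("Qual è la capitale d'Italia?", "Il Monte Bianco è la montagna più alta delle Alpi."),
])
print(scores)  # the first pair should receive the higher relevance score
```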

References

[1] https://arxiv.org/abs/2012.15828

[2] https://arxiv.org/abs/2010.05609

License

The model is released under the MIT license.
