license: apache-2.0

UForm

Multi-Modal Inference Library
For Semantic Search Applications


UForm is a Multi-Modal Inference package designed to encode Multi-Lingual Texts, Images, and, soon, Audio, Video, and Documents into a shared vector space!

This repository contains the English and multilingual UForm models converted to the CoreML MLProgram format. Currently, only the unimodal parts of the models are converted.

Descriptions

Each model is separated into two parts: an image encoder and a text encoder.

Each checkpoint is a zip archive with an MLProgram of the corresponding encoder.

A text encoder has the following input fields:

  • input_ids: int32
  • attention_mask: int32
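
As a rough illustration of how the two text inputs relate, the sketch below pads a list of token ids to a fixed length and builds the matching attention mask. The token ids, the sequence length of 8, the padding id of 0, and the helper name `build_text_inputs` are all assumptions made for the example; the real values depend on the tokenizer and on the input shape of the converted model.

```python
from typing import Dict, List


def build_text_inputs(
    token_ids: List[int],
    max_length: int = 8,   # assumed sequence length, check the model's input shape
    pad_id: int = 0,       # assumed padding id, check the tokenizer
) -> Dict[str, List[int]]:
    """Truncate or pad token ids and build the matching attention mask."""
    ids = token_ids[:max_length]
    padding = max_length - len(ids)
    return {
        "input_ids": ids + [pad_id] * padding,
        # Assumed convention: 1 marks a real token, 0 marks padding.
        "attention_mask": [1] * len(ids) + [0] * padding,
    }


# Hypothetical token ids, as a tokenizer might produce them.
inputs = build_text_inputs([101, 2023, 2003, 102])
print(inputs["input_ids"])
print(inputs["attention_mask"])
```

Before being passed to the encoder, both lists would need to be converted to int32 arrays, for example with `numpy.asarray(values, dtype=numpy.int32)`.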

An image encoder has a single input field:

  • image: float32

Both encoders return:

  • features: float32
  • embeddings: float32
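
Because both encoders map into a shared vector space, a text embedding can be compared directly with an image embedding. The sketch below assumes cosine similarity as the comparison metric, which is a common choice for semantic search but is not specified in this README; the vectors are made up and much shorter than real embeddings, and `cosine_similarity` is a helper written for this example.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # similarity is undefined for zero vectors; return 0 by convention
    return dot / (norm_a * norm_b)


# Made-up stand-ins for the `embeddings` outputs of the two encoders.
text_embedding = [0.2, 0.1, 0.7]
image_embedding = [0.25, 0.05, 0.65]
score = cosine_similarity(text_embedding, image_embedding)
print(score)
```

A higher score indicates that the text and the image are closer in the shared space.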