SeanLee97 michaelfeil committed on
Commit e785744 · verified · 1 Parent(s): b5d1f23

Update Readme: formatting and usage with infinity (#20)


- Update Readme: formatting and usage with infinity (be66459b0b8d7f26d21f260f58ae34c1edca8b95)


Co-authored-by: Michael <[email protected]>

Files changed (1)
  1. README.md +17 -8
README.md CHANGED
@@ -2665,11 +2665,11 @@ binary_docs_embeddings = quantize_embeddings(docs_embeddings, precision="ubinary
2665
 
2666
  similarities = cos_sim(query_embedding, docs_embeddings)
2667
  print('similarities:', similarities)
2668
-
2669
 
2670
  ### Transformers
2671
 
2672
-
2673
  from typing import Dict
2674
 
2675
  import torch
@@ -2717,18 +2717,19 @@ embeddings = pooling(outputs, inputs, 'cls')
2717
 
2718
  similarities = cos_sim(embeddings[0], embeddings[1:])
2719
  print('similarities:', similarities)
2720
-
2721
 
2722
  ### Transformers.js
2723
 
2724
  If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@xenova/transformers) using:
2725
 
 
2726
  npm i @xenova/transformers
2727
-
2728
 
2729
  You can then use the model to compute embeddings like this:
2730
 
2731
-
2732
  import { pipeline, cos_sim } from '@xenova/transformers';
2733
 
2734
  // Create a feature extraction pipeline
@@ -2750,13 +2751,13 @@ const output = await extractor(docs, { pooling: 'cls' });
2750
  const [source_embeddings, ...document_embeddings ] = output.tolist();
2751
  const similarities = document_embeddings.map(x => cos_sim(source_embeddings, x));
2752
  console.log(similarities); // [0.7919578577247139, 0.6369278664248345, 0.16512018371357193, 0.3620778366720027]
2753
-
2754
 
2755
  ### Using API
2756
 
2757
  You can use the model via our API as follows:
2758
 
2759
-
2760
  from mixedbread_ai.client import MixedbreadAI, EncodingFormat
2761
  from sklearn.metrics.pairwise import cosine_similarity
2762
  import os
@@ -2778,9 +2779,17 @@ res = mxbai.embeddings(
2778
 
2779
  encoded_embeddings = res.data[0].embedding
2780
  print(res.dimensions, encoded_embeddings.ubinary, encoded_embeddings.float_, encoded_embeddings.int_8)
2781
-
2782
 
2783
  The API comes with native int8 and binary quantization support! Check out the [docs](https://mixedbread.ai/docs) for more information.
2784
  ## Evaluation
2785
  As of March 2024, our model achieves SOTA performance for BERT-large-sized models on the [MTEB](https://huggingface.co/spaces/mteb/leaderboard). It outperforms commercial models like OpenAI's text-embedding-3-large and matches the performance of models 20x its size, like the [echo-mistral-7b](https://huggingface.co/jspringer/echo-mistral-7b-instruct-lasttoken). Our model was trained with no overlap with the MTEB data, which indicates that it generalizes well across several domains, tasks, and text lengths. We know there are some limitations with this model, which will be fixed in v2.
2786
 
 
2665
 
2666
  similarities = cos_sim(query_embedding, docs_embeddings)
2667
  print('similarities:', similarities)
2668
+ ```
2669
 
2670
  ### Transformers
2671
 
2672
+ ```python
2673
  from typing import Dict
2674
 
2675
  import torch
 
2717
 
2718
  similarities = cos_sim(embeddings[0], embeddings[1:])
2719
  print('similarities:', similarities)
2720
+ ```
2721
 
2722
  ### Transformers.js
2723
 
2724
  If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@xenova/transformers) using:
2725
 
2726
+ ```
2727
  npm i @xenova/transformers
2728
+ ```
2729
 
2730
  You can then use the model to compute embeddings like this:
2731
 
2732
+ ```javascript
2733
  import { pipeline, cos_sim } from '@xenova/transformers';
2734
 
2735
  // Create a feature extraction pipeline
 
2751
  const [source_embeddings, ...document_embeddings ] = output.tolist();
2752
  const similarities = document_embeddings.map(x => cos_sim(source_embeddings, x));
2753
  console.log(similarities); // [0.7919578577247139, 0.6369278664248345, 0.16512018371357193, 0.3620778366720027]
2754
+ ```
2755
 
2756
  ### Using API
2757
 
2758
  You can use the model via our API as follows:
2759
 
2760
+ ```python
2761
  from mixedbread_ai.client import MixedbreadAI, EncodingFormat
2762
  from sklearn.metrics.pairwise import cosine_similarity
2763
  import os
 
2779
 
2780
  encoded_embeddings = res.data[0].embedding
2781
  print(res.dimensions, encoded_embeddings.ubinary, encoded_embeddings.float_, encoded_embeddings.int_8)
2782
+ ```
2783
 
2784
  The API comes with native int8 and binary quantization support! Check out the [docs](https://mixedbread.ai/docs) for more information.
2785
+
2786
+ ### Infinity
2787
+ ```bash
2788
+ docker run --gpus all -v $PWD/data:/app/.cache -p "7997":"7997" \
2789
+ michaelf34/infinity:0.0.68 \
2790
+ v2 --model-id mixedbread-ai/mxbai-embed-large-v1 --revision "main" --dtype float16 --engine torch --port 7997
2791
+ ```
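
Once the container is running, the server can be queried over HTTP. The following is a minimal client sketch, not part of this commit. It assumes that the server listens on `localhost:7997` as in the command above, that it exposes an OpenAI-style `POST /embeddings` route with no URL prefix, and that the response carries a `data` list of objects with `index` and `embedding` fields. The helper names `build_request`, `parse_embeddings`, and `embed` are hypothetical and chosen only for illustration.

```python
import json
import urllib.request

# Assumptions (not stated in the commit): the container is reachable on
# localhost:7997 and serves an OpenAI-style embeddings route at /embeddings.
INFINITY_URL = "http://localhost:7997/embeddings"
MODEL_ID = "mixedbread-ai/mxbai-embed-large-v1"


def build_request(texts, url=INFINITY_URL, model=MODEL_ID):
    """Build a JSON POST request carrying the texts to embed."""
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(payload):
    """Return the embedding vectors ordered by their 'index' field."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed(texts, url=INFINITY_URL, model=MODEL_ID, timeout=30):
    """Send the texts to the server and return one vector per text."""
    request = build_request(texts, url, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_embeddings(json.load(response))
```

With the container up, `embed(["A man is eating food.", "A man is eating pasta."])` should return one vector per input text. If the server was started with a URL prefix, or its response schema differs from the assumed one, adjust `INFINITY_URL` and `parse_embeddings` accordingly.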
2792
+
2793
  ## Evaluation
2794
  As of March 2024, our model achieves SOTA performance for BERT-large-sized models on the [MTEB](https://huggingface.co/spaces/mteb/leaderboard). It outperforms commercial models like OpenAI's text-embedding-3-large and matches the performance of models 20x its size, like the [echo-mistral-7b](https://huggingface.co/jspringer/echo-mistral-7b-instruct-lasttoken). Our model was trained with no overlap with the MTEB data, which indicates that it generalizes well across several domains, tasks, and text lengths. We know there are some limitations with this model, which will be fixed in v2.
2795