juliuslipp committed on
Commit d41dac6 · verified · 1 Parent(s): 456b7cf

Update README.md

  1. README.md +24 -1
README.md CHANGED
@@ -2740,8 +2740,31 @@ console.log(similarities); // [0.7919578577247139, 0.6369278664248345, 0.1651201
2740
 
2741
  ### Using API
2742
 
2743
- You’ll be able to use the models through our API as well. The API is coming soon and will have some exciting features. Stay tuned!
2744
 
2745
 
2746
  ## Evaluation
2747
  As of March 2024, our model achieves SOTA performance for BERT-large-sized models on the [MTEB](https://huggingface.co/spaces/mteb/leaderboard). It outperforms commercial models like OpenAI's text-embedding-3-large and matches the performance of models 20x its size, like [echo-mistral-7b](https://huggingface.co/jspringer/echo-mistral-7b-instruct-lasttoken). Our model was trained with no overlap with the MTEB data, which indicates that it generalizes well across several domains, tasks, and text lengths. We know there are some limitations with this model, which will be fixed in v2.
 
2740
 
2741
  ### Using API
2742
 
2743
+ You can use the model via our API as follows.
2744
 
2745
+ ```python
2746
+ from mixedbread_ai.client import MixedbreadAI
2747
+ from sklearn.metrics.pairwise import cosine_similarity
2748
+ import os
2749
+
2750
+ mxbai = MixedbreadAI(api_key=os.environ["MIXEDBREAD_API_KEY"])  # read the key from the environment
2751
+
2752
+ english_sentences = [
2753
+     'What is the capital of Australia?',
2754
+     'Canberra is the capital of Australia.'
2755
+ ]
2756
+
2757
+ res = mxbai.embeddings(
2758
+     input=english_sentences,
2759
+     model="mixedbread-ai/mxbai-embed-large-v1"
2760
+ )
2761
+ embeddings = [entry.embedding for entry in res.data]
2762
+
2763
+ similarities = cosine_similarity([embeddings[0]], [embeddings[1]])
2764
+ print(similarities)
2765
+ ```
2766
+
2767
+ The API comes with native INT8 and binary quantization support!
2768
 
2769
  ## Evaluation
2770
  As of March 2024, our model achieves SOTA performance for BERT-large-sized models on the [MTEB](https://huggingface.co/spaces/mteb/leaderboard). It outperforms commercial models like OpenAI's text-embedding-3-large and matches the performance of models 20x its size, like [echo-mistral-7b](https://huggingface.co/jspringer/echo-mistral-7b-instruct-lasttoken). Our model was trained with no overlap with the MTEB data, which indicates that it generalizes well across several domains, tasks, and text lengths. We know there are some limitations with this model, which will be fixed in v2.
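
Note on the added line about INT8 and binary quantization: the diff does not show how the API exposes these options, so the following is only a minimal client-side sketch of what the two schemes generally mean for embedding vectors. It is not the API's implementation. The helper names (`quantize_int8`, `quantize_binary`, `hamming_distance`), the per-dimension min/max calibration, and the random 16-dimensional stand-in vectors are all assumptions made for illustration.

```python
import numpy as np


def quantize_int8(embeddings: np.ndarray):
    """Scalar-quantize float embeddings to int8 (assumed scheme: per-dimension min/max)."""
    lo = embeddings.min(axis=0)
    hi = embeddings.max(axis=0)
    scale = (hi - lo) / 255.0
    scale[scale == 0] = 1.0  # avoid division by zero for constant dimensions
    # Map each value to 0..255, then shift into the int8 range -128..127.
    quantized = np.round((embeddings - lo) / scale) - 128
    return quantized.astype(np.int8), lo, scale


def dequantize_int8(quantized: np.ndarray, lo: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Approximate the original floats from the int8 codes."""
    return (quantized.astype(np.float32) + 128) * scale + lo


def quantize_binary(embeddings: np.ndarray) -> np.ndarray:
    """Keep one bit per dimension (1 where the value is positive), packed 8 bits per byte."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)


def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two packed binary embeddings."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())


# Hypothetical stand-in data: 4 random vectors of dimension 16 (not real model output).
rng = np.random.default_rng(0)
vectors = rng.standard_normal((4, 16)).astype(np.float32)

int8_vectors, lo, scale = quantize_int8(vectors)
restored = dequantize_int8(int8_vectors, lo, scale)
packed = quantize_binary(vectors)

print(int8_vectors.dtype, int8_vectors.shape)  # 1 byte per dimension
print(packed.dtype, packed.shape)              # 1 bit per dimension, packed into bytes
print(hamming_distance(packed[0], packed[1]))
```

Under these assumptions, INT8 storage is 4x smaller than float32 and binary storage is 32x smaller, at the cost of some retrieval accuracy. Binary codes are usually compared with Hamming distance rather than cosine similarity.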