zpn committed on
Commit 57c0d5a · verified · 1 Parent(s): 6e08e51

Update README.md

Files changed (1):
  1. README.md +9 -126
README.md CHANGED
@@ -6,137 +6,20 @@ tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
+ new_version: nomic-ai/nomic-embed-text-v2-moe
  ---
 
- # SentenceTransformer based on nomic-ai/nomic-embed-text-v2-moe-unsupervised
-
- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [nomic-ai/nomic-embed-text-v2-moe-unsupervised](https://huggingface.co/nomic-ai/nomic-embed-text-v2-moe-unsupervised). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
-
- ## Model Details
-
- ### Model Description
- - **Model Type:** Sentence Transformer
- - **Base model:** [nomic-ai/nomic-embed-text-v2-moe-unsupervised](https://huggingface.co/nomic-ai/nomic-embed-text-v2-moe-unsupervised) <!-- at revision e48a32f5906ed18933f85467e57c1dcc02ef401b -->
- - **Maximum Sequence Length:** 512 tokens
- - **Output Dimensionality:** 768 dimensions
- - **Similarity Function:** Cosine Similarity
- <!-- - **Training Dataset:** Unknown -->
- <!-- - **Language:** Unknown -->
- <!-- - **License:** Unknown -->
-
- ### Model Sources
-
- - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
-
- ### Full Model Architecture
-
- ```
- SentenceTransformer(
-   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: NomicBertModel
-   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
-   (2): Normalize()
- )
- ```
-
- ## Usage
-
- ### Direct Usage (Sentence Transformers)
-
- First install the Sentence Transformers library:
-
- ```bash
- pip install -U sentence-transformers
- ```
-
- Then you can load this model and run inference.
- ```python
- from sentence_transformers import SentenceTransformer
-
- # Download from the 🤗 Hub
- model = SentenceTransformer("nomic-ai/nomic-embed-text-v2-moe-unsupervised")
- # Run inference
- sentences = [
-     'The weather is lovely today.',
-     "It's so sunny outside!",
-     'He drove to the stadium.',
- ]
- embeddings = model.encode(sentences)
- print(embeddings.shape)
- # [3, 768]
-
- # Get the similarity scores for the embeddings
- similarities = model.similarity(embeddings, embeddings)
- print(similarities.shape)
- # [3, 3]
- ```
-
- <!--
- ### Direct Usage (Transformers)
-
- <details><summary>Click to see the direct usage in Transformers</summary>
-
- </details>
- -->
-
- <!--
- ### Downstream Usage (Sentence Transformers)
-
- You can finetune this model on your own dataset.
-
- <details><summary>Click to expand</summary>
-
- </details>
- -->
-
- <!--
- ### Out-of-Scope Use
-
- *List how the model may foreseeably be misused and address what users ought not to do with the model.*
- -->
-
- <!--
- ## Bias, Risks and Limitations
-
- *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
- -->
-
- <!--
- ### Recommendations
-
- *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
- -->
-
- ## Training Details
-
- ### Framework Versions
- - Python: 3.10.12
- - Sentence Transformers: 3.3.0
- - Transformers: 4.44.2
- - PyTorch: 2.4.1+cu121
- - Accelerate: 1.2.1
- - Datasets: 3.2.0
- - Tokenizers: 0.19.1
-
- ## Citation
-
- ### BibTeX
-
- <!--
- ## Glossary
-
- *Clearly define terms in order to be accessible across audiences.*
- -->
-
- <!--
- ## Model Card Authors
-
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
- -->
-
- <!--
- ## Model Card Contact
-
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
- -->
+ # nomic-embed-text-v2-moe-unsupervised
+
+ `nomic-embed-text-v2-moe-unsupervised` is a multilingual Mixture-of-Experts (MoE) text embedding model. It is the checkpoint taken after the contrastive pretraining stage of the multi-stage contrastive training of the
+ [final model](https://huggingface.co/nomic-ai/nomic-embed-text-v2-moe).
+
+ If you want to use a model to extract embeddings, we suggest using [nomic-embed-text-v2-moe](https://huggingface.co/nomic-ai/nomic-embed-text-v2-moe).
+
+ # Join the Nomic Community
+
+ - Nomic: [https://nomic.ai](https://nomic.ai)
+ - Discord: [https://discord.gg/myY5YDR8z8](https://discord.gg/myY5YDR8z8)
+ - Twitter: [https://twitter.com/nomic_ai](https://twitter.com/nomic_ai)
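The model card removed by this commit describes the architecture as a Transformer followed by mean pooling (`pooling_mode_mean_tokens: True`) and a `Normalize()` module, so cosine similarity between embeddings reduces to a plain dot product. A minimal NumPy sketch of that post-processing step, using hypothetical per-token vectors in a toy dimension rather than the model's actual 768-dimensional outputs:

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings: np.ndarray,
                            attention_mask: np.ndarray) -> np.ndarray:
    """Mean-pool token embeddings over non-padding positions, then L2-normalize.

    token_embeddings: (seq_len, dim) hypothetical per-token vectors
    attention_mask:   (seq_len,) 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, None].astype(token_embeddings.dtype)
    pooled = (token_embeddings * mask).sum(axis=0) / mask.sum()
    return pooled / np.linalg.norm(pooled)

# Toy stand-in for the Transformer's output: 4 tokens, dim=3 instead of 768.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 3))
mask = np.array([1, 1, 1, 0])  # last position is padding

emb = mean_pool_and_normalize(tokens, mask)
print(np.linalg.norm(emb))  # unit norm after the Normalize() step

# With unit-norm embeddings, cosine similarity is just a dot product;
# self-similarity comes out as ~1.0.
sim = float(emb @ emb)
print(sim)
```

This mirrors why the card lists "Cosine Similarity" as the similarity function: once vectors are normalized, `model.similarity(...)` amounts to a matrix of dot products.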