|
# Model Card: carsonpoole/binary-embeddings |
|
|
|
## Model Description |
|
|
|
The `carsonpoole/binary-embeddings` model generates binary embeddings (compact, fixed-size bit vectors) for text data. Because bit vectors are cheap to store and fast to compare, the model is useful for tasks that require efficient storage and retrieval of text representations, such as information retrieval, document classification, and clustering.
|
|
|
## Model Details |
|
|
|
- **Model Name**: Binary Embeddings |
|
- **Model ID**: carsonpoole/binary-embeddings |
|
- **Model Type**: Embedding Model |
|
- **License**: [MIT License](https://opensource.org/licenses/MIT) |
|
- **Author**: Carson Poole |
|
|
|
## Intended Use |
|
|
|
### Primary Use Case |
|
|
|
The primary use case for this model is to generate binary embeddings for text data. These embeddings can be used in various downstream tasks, including: |
|
|
|
- Information retrieval |
|
- Document classification |
|
- Clustering |
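For all three tasks above, the practical appeal of binary embeddings is that similarity reduces to a Hamming distance between bit vectors, computable with XOR and a bit count. The sketch below illustrates this with random packed codes standing in for real document embeddings (NumPy, illustrative only; not part of this model's API):

```python
import numpy as np

def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Count differing bits between two packed uint8 bit vectors."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

rng = np.random.default_rng(0)
# Three 256-bit codes packed into 32 bytes each, standing in for stored documents
docs = rng.integers(0, 256, size=(3, 32), dtype=np.uint8)
query = docs[1].copy()  # a query identical to document 1

distances = [hamming_distance(query, d) for d in docs]
best = int(np.argmin(distances))  # index of the nearest document
```

Retrieval then amounts to ranking stored codes by their distance to the query code.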
|
|
|
### Input |
|
|
|
The model expects input text data in the form of strings. |
|
|
|
### Output |
|
|
|
The model outputs fixed-size binary vectors (bit strings) representing the input text, which can be compared efficiently, for example via Hamming distance.
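Binary outputs of this kind are typically obtained by thresholding a float embedding, e.g. keeping the sign of each dimension and packing the resulting bits. The model card does not state which scheme this model uses, so the sign-threshold sketch below is an assumption for illustration only:

```python
import numpy as np

def binarize(embedding: np.ndarray) -> np.ndarray:
    """Map a float vector to a packed bit vector: bit = 1 where the value is positive.
    Sign thresholding is an assumed scheme, not necessarily the one this model uses."""
    bits = (embedding > 0).astype(np.uint8)
    return np.packbits(bits)

vec = np.array([0.3, -1.2, 0.0, 2.5, -0.1, 0.7, -3.0, 1.1], dtype=np.float32)
code = binarize(vec)  # 8 dimensions pack into 1 byte
```

An n-dimensional float embedding packs into n/8 bytes, which is where the storage savings come from.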
|
|
|
## How to Use |
|
|
|
To use this model, you can load it with the `transformers` library and generate embeddings for your text data. Here is an example: |
|
|
|
```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("carsonpoole/binary-embeddings")
model = AutoModel.from_pretrained("carsonpoole/binary-embeddings")

# Tokenize the input text
input_text = "This is an example sentence."
inputs = tokenizer(input_text, return_tensors="pt")

# Generate per-token embeddings (no gradients needed at inference time)
with torch.no_grad():
    embeddings = model(**inputs).last_hidden_state
```
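Note that `last_hidden_state` is one vector per token; to get a single fixed-size vector per input, a common approach is to mean-pool over the valid tokens using the attention mask and then threshold. The helper below uses random tensors in place of real model outputs, and the sign-threshold binarization at the end is an assumption, not a documented property of this model:

```python
import torch

def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).float()     # (batch, seq, 1)
    summed = (last_hidden_state * mask).sum(dim=1)  # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1e-9)        # (batch, 1)
    return summed / counts

torch.manual_seed(0)
hidden = torch.randn(2, 5, 8)          # stand-in for model(**inputs).last_hidden_state
mask = torch.tensor([[1, 1, 1, 0, 0],  # first input has 2 padding tokens
                     [1, 1, 1, 1, 1]])
pooled = mean_pool(hidden, mask)       # (2, 8) float sentence embeddings
binary = (pooled > 0).to(torch.uint8)  # assumed sign-threshold binarization
```

In a real pipeline, `hidden` would be `model(**inputs).last_hidden_state` and `mask` would be `inputs["attention_mask"]` from the example above.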