Upload folder using huggingface_hub
- .gitattributes +3 -0
- README.md +57 -0
- config.json +3 -0
- model.safetensors +3 -0
- modules.json +14 -0
- pipeline.skops +3 -0
- sample_data/README.md +19 -0
- sample_data/anscombe.json +49 -0
- sample_data/california_housing_test.csv +0 -0
- sample_data/california_housing_train.csv +0 -0
- sample_data/mnist_test.csv +3 -0
- sample_data/mnist_train_small.csv +3 -0
- tokenizer.json +0 -0
.gitattributes
CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+pipeline.skops filter=lfs diff=lfs merge=lfs -text
+sample_data/mnist_test.csv filter=lfs diff=lfs merge=lfs -text
+sample_data/mnist_train_small.csv filter=lfs diff=lfs merge=lfs -text
README.md
ADDED
@@ -0,0 +1,57 @@
+---
+base_model: unknown
+library_name: model2vec
+license: mit
+model_name: my_classifier_pipeline
+tags:
+- embeddings
+- static-embeddings
+- sentence-transformers
+---
+
+# my_classifier_pipeline Model Card
+
+This [Model2Vec](https://github.com/MinishLab/model2vec) model is a fine-tuned version of the [unknown](https://huggingface.co/unknown) Model2Vec model. It also includes a classifier head on top.
+
+## Installation
+
+Install model2vec using pip:
+```
+pip install model2vec[inference]
+```
+
+## Usage
+Load this model using the `from_pretrained` method:
+```python
+from model2vec.inference import StaticModelPipeline
+
+# Load a pretrained Model2Vec model
+model = StaticModelPipeline.from_pretrained("my_classifier_pipeline")
+
+# Predict labels
+predicted = model.predict(["Example sentence"])
+```
+
+## Additional Resources
+
+- [Model2Vec Repo](https://github.com/MinishLab/model2vec)
+- [Model2Vec Base Models](https://huggingface.co/collections/minishlab/model2vec-base-models-66fd9dd9b7c3b3c0f25ca90e)
+- [Model2Vec Results](https://github.com/MinishLab/model2vec/tree/main/results)
+- [Model2Vec Tutorials](https://github.com/MinishLab/model2vec/tree/main/tutorials)
+- [Website](https://minishlab.github.io/)
+
+## Library Authors
+
+Model2Vec was developed by the [Minish Lab](https://github.com/MinishLab) team consisting of [Stephan Tulkens](https://github.com/stephantul) and [Thomas van Dongen](https://github.com/Pringled).
+
+## Citation
+
+Please cite the [Model2Vec repository](https://github.com/MinishLab/model2vec) if you use this model in your work.
+```
+@article{minishlab2024model2vec,
+  author = {Tulkens, Stephan and {van Dongen}, Thomas},
+  title = {Model2Vec: Fast State-of-the-Art Static Embeddings},
+  year = {2024},
+  url = {https://github.com/MinishLab/model2vec}
+}
+```
config.json
ADDED
@@ -0,0 +1,3 @@
+{
+  "normalize": true
+}
model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:865097bc1b1d5792f842bd245048d774360bd61bcc11530bd53c21af62fc300d
+size 129210456
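Note: the three added lines for `model.safetensors` are a Git LFS pointer (spec version, SHA-256 object id, and size in bytes), not the weights themselves. As a minimal sketch, assuming the pointer text has already been read into a string, the fields can be pulled apart like this; `parse_lfs_pointer` is a hypothetical helper written for this note, not part of `git-lfs` or `huggingface_hub`.

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split each non-empty 'key value' line of a Git LFS pointer into a dict."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The key is everything before the first space; the rest is the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# Pointer contents copied from the model.safetensors diff in this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:865097bc1b1d5792f842bd245048d774360bd61bcc11530bd53c21af62fc300d
size 129210456
"""

info = parse_lfs_pointer(pointer)
algo, _, digest = info["oid"].partition(":")
size_mib = int(info["size"]) / (1024 * 1024)
print(f"hash={algo} digest_len={len(digest)} size={size_mib:.1f} MiB")
```

The same helper applies unchanged to the `pipeline.skops` and `sample_data/mnist_*.csv` pointers elsewhere in this commit.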
modules.json
ADDED
@@ -0,0 +1,14 @@
+[
+  {
+    "idx": 0,
+    "name": "0",
+    "path": ".",
+    "type": "sentence_transformers.models.StaticEmbedding"
+  },
+  {
+    "idx": 1,
+    "name": "1",
+    "path": "1_Normalize",
+    "type": "sentence_transformers.models.Normalize"
+  }
+]
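Note: `modules.json` lists the modules in order of `idx`: a `StaticEmbedding` lookup followed by a `Normalize` step, which matches `"normalize": true` in `config.json`. The sketch below only illustrates that ordering and what L2 normalization does to a vector. It uses the standard library alone and does not load the model; `module_order` and `l2_normalize` are hypothetical helpers written for this note.

```python
import json
import math

# Contents of modules.json as added in this commit.
MODULES_JSON = """
[
  {"idx": 0, "name": "0", "path": ".",
   "type": "sentence_transformers.models.StaticEmbedding"},
  {"idx": 1, "name": "1", "path": "1_Normalize",
   "type": "sentence_transformers.models.Normalize"}
]
"""


def module_order(modules_json: str) -> list[str]:
    """Return the short class names of the modules, sorted by their idx."""
    modules = sorted(json.loads(modules_json), key=lambda m: m["idx"])
    return [m["type"].rsplit(".", 1)[-1] for m in modules]


def l2_normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        return list(vector)
    return [v / norm for v in vector]


print(module_order(MODULES_JSON))
print(l2_normalize([3.0, 4.0]))
```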
pipeline.skops
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8edb2ac2d5dcea3012551bcd4503b30740af690899b715588dc56fc4addbaad1
+size 7440466
sample_data/README.md
ADDED
@@ -0,0 +1,19 @@
+This directory includes a few sample datasets to get you started.
+
+* `california_housing_data*.csv` is California housing data from the 1990 US
+  Census; more information is available at:
+  https://docs.google.com/document/d/e/2PACX-1vRhYtsvc5eOR2FWNCwaBiKL6suIOrxJig8LcSBbmCbyYsayia_DvPOOBlXZ4CAlQ5nlDD8kTaIDRwrN/pub
+
+* `mnist_*.csv` is a small sample of the
+  [MNIST database](https://en.wikipedia.org/wiki/MNIST_database), which is
+  described at: http://yann.lecun.com/exdb/mnist/
+
+* `anscombe.json` contains a copy of
+  [Anscombe's quartet](https://en.wikipedia.org/wiki/Anscombe%27s_quartet); it
+  was originally described in
+
+  Anscombe, F. J. (1973). 'Graphs in Statistical Analysis'. American
+  Statistician. 27 (1): 17-21. JSTOR 2682899.
+
+  and our copy was prepared by the
+  [vega_datasets library](https://github.com/altair-viz/vega_datasets/blob/4f67bdaad10f45e3549984e17e1b3088c731503d/vega_datasets/_data/anscombe.json).
sample_data/anscombe.json
ADDED
@@ -0,0 +1,49 @@
+[
+{"Series":"I", "X":10.0, "Y":8.04},
+{"Series":"I", "X":8.0, "Y":6.95},
+{"Series":"I", "X":13.0, "Y":7.58},
+{"Series":"I", "X":9.0, "Y":8.81},
+{"Series":"I", "X":11.0, "Y":8.33},
+{"Series":"I", "X":14.0, "Y":9.96},
+{"Series":"I", "X":6.0, "Y":7.24},
+{"Series":"I", "X":4.0, "Y":4.26},
+{"Series":"I", "X":12.0, "Y":10.84},
+{"Series":"I", "X":7.0, "Y":4.81},
+{"Series":"I", "X":5.0, "Y":5.68},
+
+{"Series":"II", "X":10.0, "Y":9.14},
+{"Series":"II", "X":8.0, "Y":8.14},
+{"Series":"II", "X":13.0, "Y":8.74},
+{"Series":"II", "X":9.0, "Y":8.77},
+{"Series":"II", "X":11.0, "Y":9.26},
+{"Series":"II", "X":14.0, "Y":8.10},
+{"Series":"II", "X":6.0, "Y":6.13},
+{"Series":"II", "X":4.0, "Y":3.10},
+{"Series":"II", "X":12.0, "Y":9.13},
+{"Series":"II", "X":7.0, "Y":7.26},
+{"Series":"II", "X":5.0, "Y":4.74},
+
+{"Series":"III", "X":10.0, "Y":7.46},
+{"Series":"III", "X":8.0, "Y":6.77},
+{"Series":"III", "X":13.0, "Y":12.74},
+{"Series":"III", "X":9.0, "Y":7.11},
+{"Series":"III", "X":11.0, "Y":7.81},
+{"Series":"III", "X":14.0, "Y":8.84},
+{"Series":"III", "X":6.0, "Y":6.08},
+{"Series":"III", "X":4.0, "Y":5.39},
+{"Series":"III", "X":12.0, "Y":8.15},
+{"Series":"III", "X":7.0, "Y":6.42},
+{"Series":"III", "X":5.0, "Y":5.73},
+
+{"Series":"IV", "X":8.0, "Y":6.58},
+{"Series":"IV", "X":8.0, "Y":5.76},
+{"Series":"IV", "X":8.0, "Y":7.71},
+{"Series":"IV", "X":8.0, "Y":8.84},
+{"Series":"IV", "X":8.0, "Y":8.47},
+{"Series":"IV", "X":8.0, "Y":7.04},
+{"Series":"IV", "X":8.0, "Y":5.25},
+{"Series":"IV", "X":19.0, "Y":12.50},
+{"Series":"IV", "X":8.0, "Y":5.56},
+{"Series":"IV", "X":8.0, "Y":7.91},
+{"Series":"IV", "X":8.0, "Y":6.89}
+]
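Note: `anscombe.json` is a flat list of `{"Series", "X", "Y"}` records. A minimal sketch of grouping the records by series and computing per-series means follows, with the rows of series I inlined so the snippet is self-contained. `series_means` is a hypothetical helper written for this note; to run it on the full file you would pass the result of `json.load` on `sample_data/anscombe.json` instead.

```python
import json
from collections import defaultdict
from statistics import mean

# Series I rows copied from sample_data/anscombe.json in this commit.
SERIES_I_JSON = """[
{"Series":"I", "X":10.0, "Y":8.04}, {"Series":"I", "X":8.0, "Y":6.95},
{"Series":"I", "X":13.0, "Y":7.58}, {"Series":"I", "X":9.0, "Y":8.81},
{"Series":"I", "X":11.0, "Y":8.33}, {"Series":"I", "X":14.0, "Y":9.96},
{"Series":"I", "X":6.0, "Y":7.24}, {"Series":"I", "X":4.0, "Y":4.26},
{"Series":"I", "X":12.0, "Y":10.84}, {"Series":"I", "X":7.0, "Y":4.81},
{"Series":"I", "X":5.0, "Y":5.68}
]"""


def series_means(records: list[dict]) -> dict[str, tuple[float, float]]:
    """Group records by their "Series" key and return (mean X, mean Y) per series."""
    xs: dict[str, list[float]] = defaultdict(list)
    ys: dict[str, list[float]] = defaultdict(list)
    for row in records:
        xs[row["Series"]].append(row["X"])
        ys[row["Series"]].append(row["Y"])
    return {name: (mean(xs[name]), mean(ys[name])) for name in xs}


means = series_means(json.loads(SERIES_I_JSON))
for name, (mx, my) in means.items():
    print(f"series {name}: mean X = {mx:.2f}, mean Y = {my:.2f}")
```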
sample_data/california_housing_test.csv
ADDED
The diff for this file is too large to render.
sample_data/california_housing_train.csv
ADDED
The diff for this file is too large to render.
sample_data/mnist_test.csv
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:51c292478d94ec3a01461bdfa82eb0885d262eb09e615679b2d69dedb6ad09e7
+size 18289443
sample_data/mnist_train_small.csv
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1ef64781aa03180f4f5ce504314f058f5d0227277df86060473d973cf43b033e
+size 36523880
tokenizer.json
ADDED
The diff for this file is too large to render.