Add SetFit model
- README.md +52 -150
- config.json +1 -1
- model.safetensors +1 -1
- model_head.pkl +1 -1
README.md
CHANGED
@@ -11,136 +11,41 @@ metrics:
|
|
11 |
- recall
|
12 |
- f1
|
13 |
widget:
|
14 |
-
- text:
|
15 |
-
|
16 |
-
|
17 |
-
|
18 |
-
|
19 |
-
|
20 |
-
|
21 |
-
|
22 |
-
|
23 |
-
|
24 |
-
|
25 |
-
|
26 |
-
|
27 |
-
|
28 |
-
|
29 |
-
|
30 |
-
|
31 |
-
|
32 |
-
|
33 |
-
|
34 |
-
|
35 |
-
|
36 |
-
|
37 |
-
|
38 |
-
|
39 |
-
|
40 |
-
|
41 |
-
|
42 |
-
|
43 |
-
|
44 |
-
|
45 |
-
|
46 |
-
|
47 |
-
|
48 |
-
|
49 |
-
\ Cannot take the length of Shape with unknown rank.</code> inside the keras'es\
|
50 |
-
\ <code>_standardize_user_data</code> function.</p>\n\n<p>I have tried quite a\
|
51 |
-
\ few things but could not resolve the issue. Any ideas?</p>\n\n<p><strong>Edit:</strong>\
|
52 |
-
\ based on @kvish's answer, the solution was to change the map from a lambda to\
|
53 |
-
\ a function that would specify the correct tensor dimensions, e.g.:</p>\n\n<pre\
|
54 |
-
\ class=\"lang-py prettyprint-override\"><code>def data_loader(filename):\n \
|
55 |
-
\ def loader_impl(filename):\n features, labels, _ = load_hdf5(filename)\n\
|
56 |
-
\ ...\n return features, labels\n\n features, labels = tf.py_func(loader_impl,\
|
57 |
-
\ [filename], [tf.float32, tf.float32])\n features.set_shape((None, 100))\n\
|
58 |
-
\ labels.set_shape((None, 1))\n return features, labels\n</code></pre>\n\
|
59 |
-
\n<p>and now, all needed to do is to call this function from <code>map</code>:</p>\n\
|
60 |
-
\n<pre class=\"lang-py prettyprint-override\"><code>dataset = dataset.map(data_loader)\n\
|
61 |
-
</code></pre>\n"
|
62 |
-
- text: "<p>I'm wondering what the current available options are for simulating BatchNorm\
|
63 |
-
\ folding during quantization aware training in Tensorflow 2. Tensorflow 1 has\
|
64 |
-
\ the <code>tf.contrib.quantize.create_training_graph</code> function which inserts\
|
65 |
-
\ FakeQuantization layers into the graph and takes care of simulating batch normalization\
|
66 |
-
\ folding (according to this <a href=\"https://arxiv.org/pdf/1806.08342.pdf\"\
|
67 |
-
\ rel=\"noreferrer\">white paper</a>).</p>\n\n<p>Tensorflow 2 has a <a href=\"\
|
68 |
-
https://www.tensorflow.org/model_optimization/guide/quantization/training_comprehensive_guide.md\"\
|
69 |
-
\ rel=\"noreferrer\">tutorial</a> on how to use quantization in their recently\
|
70 |
-
\ adopted <code>tf.keras</code> API, but they don't mention anything about batch\
|
71 |
-
\ normalization. I tried the following simple example with a BatchNorm layer:</p>\n\
|
72 |
-
\n<pre><code>import tensorflow_model_optimization as tfmo\n\nmodel = tf.keras.Sequential([\n\
|
73 |
-
\ l.Conv2D(32, 5, padding='same', activation='relu', input_shape=input_shape),\n\
|
74 |
-
\ l.MaxPooling2D((2, 2), (2, 2), padding='same'),\n l.Conv2D(64, 5,\
|
75 |
-
\ padding='same', activation='relu'),\n l.BatchNormalization(), # BN!\n\
|
76 |
-
\ l.MaxPooling2D((2, 2), (2, 2), padding='same'),\n l.Flatten(),\n \
|
77 |
-
\ l.Dense(1024, activation='relu'),\n l.Dropout(0.4),\n l.Dense(num_classes),\n\
|
78 |
-
\ l.Softmax(),\n])\nmodel = tfmo.quantization.keras.quantize_model(model)\n\
|
79 |
-
</code></pre>\n\n<p>It however gives the following exception:</p>\n\n<pre><code>RuntimeError:\
|
80 |
-
\ Layer batch_normalization:<class 'tensorflow.python.keras.layers.normalization.BatchNormalization'>\
|
81 |
-
\ is not supported. You can quantize this layer by passing a `tfmot.quantization.keras.QuantizeConfig`\
|
82 |
-
\ instance to the `quantize_annotate_layer` API.\n</code></pre>\n\n<p>which indicates\
|
83 |
-
\ that TF does not know what to do with it.</p>\n\n<p>I also saw <a href=\"https://stackoverflow.com/questions/52259343/quantize-a-keras-neural-network-model/57785739#57785739\"\
|
84 |
-
>this related topic</a> where they apply <code>tf.contrib.quantize.create_training_graph</code>\
|
85 |
-
\ on a keras constructed model. They however don't use BatchNorm layers, so I'm\
|
86 |
-
\ not sure this will work.</p>\n\n<p>So what are the options for using this BatchNorm\
|
87 |
-
\ folding feature in TF2? Can this be done from the keras API, or should I switch\
|
88 |
-
\ back to the TensorFlow 1 API and define a graph the old way?</p>\n"
|
89 |
-
- text: '<p>How can I get the file name of a <a href="https://www.tensorflow.org/api_docs/python/tf/summary/FileWriter"
|
90 |
-
rel="nofollow noreferrer"><code>tf.summary.FileWriter</code></a> (<a href="https://web.archive.org/web/20170321224015/https://www.tensorflow.org/api_docs/python/tf/summary/FileWriter"
|
91 |
-
rel="nofollow noreferrer">mirror</a>) in TensorFlow?</p>
|
92 |
-
|
93 |
-
|
94 |
-
<p>I am aware that I can use <a href="https://www.tensorflow.org/api_docs/python/tf/summary/FileWriter#get_logdir"
|
95 |
-
rel="nofollow noreferrer"><code>get_logdir()</code></a> but I don''t see any
|
96 |
-
similar method to access the file name.</p>
|
97 |
-
|
98 |
-
'
|
99 |
-
- text: "<p>Will future versions of tensorflow provide a way to run the tensorflow\
|
100 |
-
\ graph generated by single node tf.sess on a distributed environments with multiple\
|
101 |
-
\ ps nodes and worker nodes through python interfaces?\nOr is it supported right\
|
102 |
-
\ now?</p>\n\n<p>I am trying to build my tf.graph on my notebook (single node)\
|
103 |
-
\ and save then graph into a binary file, \nand then loading the binary graph\
|
104 |
-
\ into a distributed environment (with multiply ps and worker nodes) to train\
|
105 |
-
\ and verify it. It seems it is not supported now.</p>\n\n<p>I tried it on tensorflow-0.10\
|
106 |
-
\ and failed.\nBy using</p>\n\n<pre><code>tf.train.write_graph(sess.graph_def,\
|
107 |
-
\ path, pb_name)\n</code></pre>\n\n<p>interface: The graph saved is not trainable\
|
108 |
-
\ as loading the <code>.pb</code> file through <code>import_graph_def</code> will\
|
109 |
-
\ only <code>g.create_ops</code> according to the <code>.bp</code> file but not\
|
110 |
-
\ add then into <code>ops.collections</code>. So the graph loaded is not trainable.</p>\n\
|
111 |
-
\n<p>By using <code>tf.saver.save</code> to save a <code>.meta</code> file: The\
|
112 |
-
\ loaded graph cannot fit into the distributed environment as devices assignment\
|
113 |
-
\ is messy.</p>\n\n<p>I tried the</p>\n\n<pre><code>tf.train.import_meta_graph('test_model.meta',\
|
114 |
-
\ clear_devices=True)\n</code></pre>\n\n<p>interface to let the load clean the\
|
115 |
-
\ original device assignment and let the <code>with tf.device(device_setter)</code>\
|
116 |
-
\ reassign the device for each variable, but there is a problem as operations\
|
117 |
-
\ belonging to <code>Saver</code> and <code>Restore</code> still can not be assigned\
|
118 |
-
\ correctly. When creating operations for <code>Saver</code> and <code>Restore</code>\
|
119 |
-
\ ops through <code>g.create_op</code> inside <code>import_graph_def</code> called\
|
120 |
-
\ by <code>import_meta_graph</code>, the device_setter will not assign ps node\
|
121 |
-
\ to these ops as their name is not <code>Variable</code>.\nIs there any way to\
|
122 |
-
\ do so?</p>\n"
|
123 |
-
- text: "<p>I use <code>freeze_graph</code> to export my model to a file named <code>\"\
|
124 |
-
frozen.pb\"</code>. But Found that the accuracy of predictions on <code>frozen.pb</code>\
|
125 |
-
\ is very bad.</p>\n\n<p>I know the problem maybe <code>MovingAverage</code> not\
|
126 |
-
\ included in <code>frozen.pb</code>.</p>\n\n<p>When I use <code>model.ckpt</code>\
|
127 |
-
\ files to restore model for evaluating, if I call <code>tf.train.ExponentialMovingAverage(0.999)</code>\
|
128 |
-
\ , then the accuracy is good as expected, else the accuracy is bad.</p>\n\n<p><strong>So\
|
129 |
-
\ How To export a binary model which performance is the same as the one restored\
|
130 |
-
\ from checkpoint files?</strong> I want to use <code>\".pb\"</code> files in\
|
131 |
-
\ Android Devices.</p>\n\n<p><a href=\"https://www.tensorflow.org/versions/r0.12/api_docs/python/train/moving_averages\"\
|
132 |
-
\ rel=\"nofollow noreferrer\">The official document</a> doesn't mention this.</p>\n\
|
133 |
-
\n<p>Thanks!!</p>\n\n<p>Freeze Command:</p>\n\n<pre><code>~/bazel-bin/tensorflow/python/tools/freeze_graph\
|
134 |
-
\ \\\n --input_graph=./graph.pbtxt \\\n --input_checkpoint=./model.ckpt-100000\
|
135 |
-
\ \\\n --output_graph=frozen.pb \\\n --output_node_names=output \\\n --restore_op_name=save/restore_all\
|
136 |
-
\ \\\n --clear_devices\n</code></pre>\n\n<p>Evaluate Code:</p>\n\n<pre><code>...\
|
137 |
-
\ ...\nlogits = carc19.inference(images)\ntop_k = tf.nn.top_k(logits, k=10)\n\n\
|
138 |
-
# Precision: 97%\n# Restore the moving average version of the learned variables\
|
139 |
-
\ for eval.\nvariable_averages = tf.train.ExponentialMovingAverage(carc19.MOVING_AVERAGE_DECAY)\n\
|
140 |
-
variables_to_restore = variable_averages.variables_to_restore()\nfor k in variables_to_restore.keys():\n\
|
141 |
-
\ print (k,variables_to_restore[k])\nsaver = tf.train.Saver(variables_to_restore)\n\
|
142 |
-
\n# Precision: 84%\n#saver = tf.train.Saver()\n\n#model_path = '/tmp/carc19_train/model.ckpt-9801'\n\
|
143 |
-
with tf.Session() as sess:\n saver.restore(sess, model_path)\n... ...\n</code></pre>\n"
|
144 |
pipeline_tag: text-classification
|
145 |
inference: true
|
146 |
base_model: flax-sentence-embeddings/stackoverflow_mpnet-base
|
@@ -156,16 +61,16 @@ model-index:
|
|
156 |
split: test
|
157 |
metrics:
|
158 |
- type: accuracy
|
159 |
-
value: 0.
|
160 |
name: Accuracy
|
161 |
- type: precision
|
162 |
-
value: 0.
|
163 |
name: Precision
|
164 |
- type: recall
|
165 |
-
value: 0.
|
166 |
name: Recall
|
167 |
- type: f1
|
168 |
-
value: 0.
|
169 |
name: F1
|
170 |
---
|
171 |
|
@@ -197,17 +102,17 @@ The model has been trained using an efficient few-shot learning technique that i
|
|
197 |
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
|
198 |
|
199 |
### Model Labels
|
200 |
-
| Label | Examples
|
201 |
-
|
202 |
-
|
|
203 |
-
|
|
204 |
|
205 |
## Evaluation
|
206 |
|
207 |
### Metrics
|
208 |
| Label | Accuracy | Precision | Recall | F1 |
|
209 |
|:--------|:---------|:----------|:-------|:-------|
|
210 |
-
| **all** | 0.
|
211 |
|
212 |
## Uses
|
213 |
|
@@ -227,10 +132,7 @@ from setfit import SetFitModel
|
|
227 |
# Download from the 🤗 Hub
|
228 |
model = SetFitModel.from_pretrained("sharukat/so_mpnet-base_question_classifier")
|
229 |
# Run inference
|
230 |
-
preds = model("
|
231 |
-
|
232 |
-
<p>I am aware that I can use <a href=\"https://www.tensorflow.org/api_docs/python/tf/summary/FileWriter#get_logdir\" rel=\"nofollow noreferrer\"><code>get_logdir()</code></a> but I don't see any similar method to access the file name.</p>
|
233 |
-
")
|
234 |
```
|
235 |
|
236 |
<!--
|
@@ -260,14 +162,14 @@ preds = model("<p>How can I get the file name of a <a href=\"https://www.tensorf
|
|
260 |
## Training Details
|
261 |
|
262 |
### Training Set Metrics
|
263 |
-
| Training set | Min | Median | Max
|
264 |
-
|
265 |
-
| Word count |
|
266 |
|
267 |
| Label | Training Sample Count |
|
268 |
|:------|:----------------------|
|
269 |
-
| 0 |
|
270 |
-
| 1 |
|
271 |
|
272 |
### Training Hyperparameters
|
273 |
- batch_size: (8, 8)
|
@@ -290,8 +192,8 @@ preds = model("<p>How can I get the file name of a <a href=\"https://www.tensorf
|
|
290 |
### Training Results
|
291 |
| Epoch | Step | Training Loss | Validation Loss |
|
292 |
|:-------:|:---------:|:-------------:|:---------------:|
|
293 |
-
| 0.
|
294 |
-
| **1.0** | **
|
295 |
|
296 |
* The bold row denotes the saved checkpoint.
|
297 |
### Framework Versions
|
|
|
11 |
- recall
|
12 |
- f1
|
13 |
widget:
|
14 |
+
- text: 'I''m trying to take a dataframe and convert them to tensors to train a model
|
15 |
+
in keras. I think it''s being triggered when I am converting my Y label to a tensor:
|
16 |
+
I''m getting the following error when casting y_train to tensor from slices: In
|
17 |
+
the tutorials this seems to work but I think those tutorials are doing multiclass
|
18 |
+
classifications whereas I''m doing a regression so y_train is a series not multiple
|
19 |
+
columns. Any suggestions of what I can do?'
|
20 |
+
- text: My weights are defined as I want to use the weights decay so I add, for example,
|
21 |
+
the argument to the tf.get_variable. Now I'm wondering if during the evaluation
|
22 |
+
phase this is still correct or maybe I have to set the regularizer factor to 0.
|
23 |
+
There is also another argument trainable. The documentation says If True also
|
24 |
+
add the variable to the graph collection GraphKeys.TRAINABLE_VARIABLES. which
|
25 |
+
is not clear to me. Should I use it? Can someone explain to me if the weights
|
26 |
+
decay effects in a sort of wrong way the evaluation step? How can I solve in that
|
27 |
+
case?
|
28 |
+
- text: 'Maybe I''m confused about what "inner" and "outer" tensor dimensions are,
|
29 |
+
but the documentation for tf.matmul puzzles me: Isn''t it the case that R-rank
|
30 |
+
arguments need to have matching (or no) R-2 outer dimensions, and that (as in
|
31 |
+
normal matrix multiplication) the Rth, inner dimension of the first argument must
|
32 |
+
match the R-1st dimension of the second. That is, in The outer dimensions a, ...,
|
33 |
+
z must be identical to a'', ..., z'' (or not exist), and x and x'' must match
|
34 |
+
(while p and q can be anything). Or put another way, shouldn''t the docs say:'
|
35 |
+
- text: 'I am using tf.data with reinitializable iterator to handle training and dev
|
36 |
+
set data. For each epoch, I initialize the training data set. The official documentation
|
37 |
+
has similar structure. I think this is not efficient especially if the training
|
38 |
+
set is large. Some of the resources I found online has sess.run(train_init_op,
|
39 |
+
feed_dict={X: X_train, Y: Y_train}) before the for loop to avoid this issue. But
|
40 |
+
then we can''t process the dev set after each epoch; we can only process it after
|
41 |
+
we are done iterating over epochs epochs. Is there a way to efficiently process
|
42 |
+
the dev set after each epoch?'
|
43 |
+
- text: 'Why is the pred variable being calculated before any of the training iterations
|
44 |
+
occur? I would expect that a pred would be generated (through the RNN() function)
|
45 |
+
during each pass through of the data for every iteration? There must be something
|
46 |
+
I am missing. Is pred something like a function object? I have looked at the docs
|
47 |
+
for tf.matmul() and that returns a tensor, not a function. Full source: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/recurrent_network.py
|
48 |
+
Here is the code:'
|
49 |
pipeline_tag: text-classification
|
50 |
inference: true
|
51 |
base_model: flax-sentence-embeddings/stackoverflow_mpnet-base
|
|
|
61 |
split: test
|
62 |
metrics:
|
63 |
- type: accuracy
|
64 |
+
value: 0.81875
|
65 |
name: Accuracy
|
66 |
- type: precision
|
67 |
+
value: 0.8248924988055423
|
68 |
name: Precision
|
69 |
- type: recall
|
70 |
+
value: 0.81875
|
71 |
name: Recall
|
72 |
- type: f1
|
73 |
+
value: 0.8178892421209625
|
74 |
name: F1
|
75 |
---
|
76 |
|
|
|
102 |
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
|
103 |
|
104 |
### Model Labels
|
105 |
+
| Label | Examples |
|
106 |
+
|:------|:---------|
|
107 |
+
| 1 | <ul><li>'In tf.gradients, there is a keyword argument grad_ys Why is grads_ys needed here? The docs here is implicit. Could you please give some specific purpose and code? And my example code for tf.gradients is'</li><li>'I am coding a Convolutional Neural Network to classify images in TensorFlow but there is a problem: When I try to feed my NumPy array of flattened images (3 channels with RGB values from 0 to 255) to a tf.estimator.inputs.numpy_input_fn I get the following error: My numpy_imput_fn looks like this: In the documentation for the function it is said that x should be a dict of NumPy array:'</li><li>'I am trying to use tf.pad. Here is my attempt to pad the tensor to length 20, with values 10. I get this error message I am looking at the documentation https://www.tensorflow.org/api_docs/python/tf/pad But I am unable to figure out how to shape the pad value'</li></ul> |
|
108 |
+
| 0 | <ul><li>"I am trying to use tf.train.shuffle_batch to consume batches of data from a TFRecord file using TensorFlow 1.0. The relevant functions are: The code enters through examine_batches(), having been handed the output of batch_generator(). batch_generator() calls tfrecord_to_graph_ops() and the problem is in that function, I believe. I am calling on a file with 1,000 bytes (numbers 0-9). If I call eval() on this in a Session, it shows me all 1,000 elements. But if I try to put it in a batch generator, it crashes. If I don't reshape targets, I get an error like ValueError: All shapes must be fully defined when tf.train.shuffle_batch is called. If I call targets.set_shape([1]), reminiscent of Google's CIFAR-10 example code, I get an error like Invalid argument: Shape mismatch in tuple component 0. Expected [1], got [1000] in tf.train.shuffle_batch. I also tried using tf.strided_slice to cut a chunk of the raw data - this doesn't crash but it results in just getting the first event over and over again. What is the right way to do this? To pull batches from a TFRecord file? Note, I could manually write a function that chopped up the raw byte data and did some sort of batching - especially easy if I am using the feed_dict approach to getting data into the graph - but I am trying to learn how to use TensorFlow's TFRecord files and how to use their built in batching functions. Thanks!"</li><li>"I am fairly new to TF and ML in general, so I have relied heavily on the documentation and tutorials provided by TF. I have been following along with the Tensorflow 2.0 Objection Detection API tutorial to the letter and have encountered an issue while training: everytime I run the training script model_main_tf2.py, it always hangs after the output: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2) after a number of depreciation warnings. 
I have tried many different ways of fixing this, including modifying the train script and pipeline.config files. My dataset isn't very large, less than 100 images with a max of 15 labels per image. useful info: Python 3.8.0 Tensorflow 2.4.4 (Non GPU) Windows 10 Pro Any and all help is appreciated!"</li><li>'I found two solutions to calculate FLOPS of Keras models (TF 2.x): [1] https://github.com/tensorflow/tensorflow/issues/32809#issuecomment-849439287 [2] https://github.com/tensorflow/tensorflow/issues/32809#issuecomment-841975359 At first glance, both seem to work perfectly when testing with tf.keras.applications.ResNet50(). The resulting FLOPS are identical and correspond to the FLOPS of the ResNet paper. But then I built a small GRU model and found different FLOPS for the two methods: This results in the following numbers: 13206 for method [1] and 18306 for method [2]. That is really confusing... Does anyone know how to correctly calculate FLOPS of recurrent Keras models in TF 2.x? EDIT I found another information: [3] https://github.com/tensorflow/tensorflow/issues/36391#issuecomment-596055100 When adding this argument to convert_variables_to_constants_v2, the outputs of [1] and [2] are the same when using my GRU example. The tensorflow documentation explains this argument as follows (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/framework/convert_to_constants.py): Can someone try to explain this?'</li></ul> |
|
109 |
|
110 |
## Evaluation
|
111 |
|
112 |
### Metrics
|
113 |
| Label | Accuracy | Precision | Recall | F1 |
|
114 |
|:--------|:---------|:----------|:-------|:-------|
|
115 |
+
| **all** | 0.8187 | 0.8249 | 0.8187 | 0.8179 |
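The reported recall (0.8187) equals the reported accuracy, which is what support-weighted averaging over the labels always produces. Whether the card's metrics were in fact computed with weighted averaging is an assumption, not something the diff states. A minimal sketch of that averaging in plain Python, using invented toy labels rather than the model's real predictions:

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall and F1."""
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per label
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / n

    precision = recall = f1 = 0.0
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        predicted = sum(p == label for p in y_pred)
        actual = support[label]
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / actual if actual else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = actual / n  # each label counts in proportion to its support
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

Weighted recall sums the true positives of every label and divides by the sample count, so it reduces to accuracy; precision and F1 generally differ from it, as they do in the table above.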
|
116 |
|
117 |
## Uses
|
118 |
|
|
|
132 |
# Download from the 🤗 Hub
|
133 |
model = SetFitModel.from_pretrained("sharukat/so_mpnet-base_question_classifier")
|
134 |
# Run inference
|
135 |
+
preds = model("I'm trying to take a dataframe and convert them to tensors to train a model in keras. I think it's being triggered when I am converting my Y label to a tensor: I'm getting the following error when casting y_train to tensor from slices: In the tutorials this seems to work but I think those tutorials are doing multiclass classifications whereas I'm doing a regression so y_train is a series not multiple columns. Any suggestions of what I can do?")
|
136 |
```
|
137 |
|
138 |
<!--
|
|
|
162 |
## Training Details
|
163 |
|
164 |
### Training Set Metrics
|
165 |
+
| Training set | Min | Median | Max |
|
166 |
+
|:-------------|:----|:---------|:----|
|
167 |
+
| Word count | 12 | 128.0219 | 907 |
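The `Median` value of 128.0219 is neither a whole number nor a half, which a true median of 640 integer word counts (320 + 320 training samples) would have to be, so the column likely holds a mean. A minimal sketch for recomputing both statistics on your own texts follows; splitting on whitespace is an assumption about how words were counted, and the sample questions are invented:

```python
import statistics


def word_count_stats(texts):
    """Summarise whitespace-separated word counts for a list of texts."""
    counts = [len(text.split()) for text in texts]
    return {
        "min": min(counts),
        "median": statistics.median(counts),
        "mean": statistics.mean(counts),
        "max": max(counts),
    }


# Invented sample questions, not taken from the training set.
texts = [
    "How do I pad a tensor?",
    "Why is pred computed before training starts in this example?",
    "Shape error",
    "What does grad_ys do in tf.gradients and when is it needed?",
]
stats = word_count_stats(texts)
print(stats)
```

Comparing `median` and `mean` on the real training split would show which of the two the reported figure matches.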
|
168 |
|
169 |
| Label | Training Sample Count |
|
170 |
|:------|:----------------------|
|
171 |
+
| 0 | 320 |
|
172 |
+
| 1 | 320 |
|
173 |
|
174 |
### Training Hyperparameters
|
175 |
- batch_size: (8, 8)
|
|
|
192 |
### Training Results
|
193 |
| Epoch | Step | Training Loss | Validation Loss |
|
194 |
|:-------:|:---------:|:-------------:|:---------------:|
|
195 |
+
| 0.0000 | 1 | 0.3266 | - |
|
196 |
+
| **1.0** | **25640** | **0.0** | **0.2863** |
|
197 |
|
198 |
* The bold row denotes the saved checkpoint.
|
199 |
### Framework Versions
|
config.json
CHANGED
@@ -1,5 +1,5 @@
|
|
1 |
{
|
2 |
-
"_name_or_path": "checkpoints/
|
3 |
"architectures": [
|
4 |
"MPNetModel"
|
5 |
],
|
|
|
1 |
{
|
2 |
+
"_name_or_path": "checkpoints/step_25640",
|
3 |
"architectures": [
|
4 |
"MPNetModel"
|
5 |
],
|
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 437967672
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:d2f525cfb8e8b3946018793494ac6a26049d8ed198c5d20c983cf73ec90efb45
|
3 |
size 437967672
|
model_head.pkl
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 7007
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:016c123d70461936b9b9498918910c3b3d84871e44234b11028d4b736312f54e
|
3 |
size 7007
|
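The `model.safetensors` and `model_head.pkl` entries above are Git LFS pointers: a `version` line, a SHA-256 `oid`, and a `size` in bytes. A minimal sketch for checking a downloaded file against such a pointer, using only the standard library; the function names are hypothetical helpers, and the pointer text is the new `model_head.pkl` pointer from this commit:

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and hash."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as fh:
        # Read in chunks so large weight files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:016c123d70461936b9b9498918910c3b3d84871e44234b11028d4b736312f54e
size 7007
"""
pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model_head.pkl", pointer)` on a local copy of the file would then confirm that the download matches this commit.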