---
library_name: setfit
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
metrics:
- accuracy
- precision
- recall
- f1
widget:
- text: >
<p><a
href="https://kwotsin.github.io/tech/2017/02/11/transfer-learning.html"
rel="nofollow
noreferrer">https://kwotsin.github.io/tech/2017/02/11/transfer-learning.html</a>
I followed the above link to make an image classifier</p>
<p>Training code:</p>
<pre><code>slim = tf.contrib.slim
dataset_dir = './data'
log_dir = './log'
checkpoint_file = './inception_resnet_v2_2016_08_30.ckpt'
image_size = 299
num_classes = 21
labels_file = './labels.txt'
labels = open(labels_file, 'r')
labels_to_name = {}
for line in labels:
label, string_name = line.split(':')
string_name = string_name[:-1]
labels_to_name[int(label)] = string_name
file_pattern = 'test_%s_*.tfrecord'
items_to_descriptions = {
'image': 'A 3-channel RGB coloured product image',
'label': 'A label that from 20 labels'
}
num_epochs = 10
batch_size = 16
initial_learning_rate = 0.001
learning_rate_decay_factor = 0.7
num_epochs_before_decay = 4
def get_split(split_name, dataset_dir, file_pattern=file_pattern,
file_pattern_for_counting='products'):
if split_name not in ['train', 'validation']:
raise ValueError(
'The split_name %s is not recognized. Please input either train or validation as the split_name' % (
split_name))
file_pattern_path = os.path.join(dataset_dir, file_pattern % (split_name))
num_samples = 0
file_pattern_for_counting = file_pattern_for_counting + '_' + split_name
tfrecords_to_count = [os.path.join(dataset_dir, file) for file in os.listdir(dataset_dir) if
file.startswith(file_pattern_for_counting)]
for tfrecord_file in tfrecords_to_count:
for record in tf.python_io.tf_record_iterator(tfrecord_file):
num_samples += 1
test = num_samples
reader = tf.TFRecordReader
keys_to_features = {
'image/encoded': tf.FixedLenFeature((), tf.string, default_value=''),
'image/format': tf.FixedLenFeature((), tf.string, default_value='jpg'),
'image/class/label': tf.FixedLenFeature(
[], tf.int64, default_value=tf.zeros([], dtype=tf.int64)),
}
items_to_handlers = {
'image': slim.tfexample_decoder.Image(),
'label': slim.tfexample_decoder.Tensor('image/class/label'),
}
decoder = slim.tfexample_decoder.TFExampleDecoder(keys_to_features, items_to_handlers)
labels_to_name_dict = labels_to_name
dataset = slim.dataset.Dataset(
data_sources=file_pattern_path,
decoder=decoder,
reader=reader,
num_readers=4,
num_samples=num_samples,
num_classes=num_classes,
labels_to_name=labels_to_name_dict,
items_to_descriptions=items_to_descriptions)
return dataset
def load_batch(dataset, batch_size, height=image_size, width=image_size,
is_training=True):
'''
Loads a batch for training.
INPUTS:
- dataset(Dataset): a Dataset class object that is created from the get_split function
- batch_size(int): determines how big of a batch to train
- height(int): the height of the image to resize to during preprocessing
- width(int): the width of the image to resize to during preprocessing
- is_training(bool): to determine whether to perform a training or evaluation preprocessing
OUTPUTS:
- images(Tensor): a Tensor of the shape (batch_size, height, width, channels) that contain one batch of images
- labels(Tensor): the batch's labels with the shape (batch_size,) (requires one_hot_encoding).
'''
# First create the data_provider object
data_provider = slim.dataset_data_provider.DatasetDataProvider(
dataset,
common_queue_capacity=24 + 3 * batch_size,
common_queue_min=24)
# Obtain the raw image using the get method
raw_image, label = data_provider.get(['image', 'label'])
image = inception_preprocessing.preprocess_image(raw_image, height, width, is_training)
raw_image = tf.expand_dims(raw_image, 0)
raw_image = tf.image.resize_nearest_neighbor(raw_image, [height, width])
raw_image = tf.squeeze(raw_image)
images, raw_images, labels = tf.train.batch(
[image, raw_image, label],
batch_size=batch_size,
num_threads=4,
capacity=4 * batch_size,
allow_smaller_final_batch=True)
return images, raw_images, labels
def run():
if not os.path.exists(log_dir):
os.mkdir(log_dir)
with tf.Graph().as_default() as graph:
tf.logging.set_verbosity(tf.logging.INFO)
dataset = get_split('train', dataset_dir, file_pattern=file_pattern)
images, _, labels = load_batch(dataset, batch_size=batch_size)
num_batches_per_epoch = int(dataset.num_samples / batch_size)
num_steps_per_epoch = num_batches_per_epoch
decay_steps = int(num_epochs_before_decay * num_steps_per_epoch)
with slim.arg_scope(inception_resnet_v2_arg_scope()):
logits, end_points = inception_resnet_v2(images, num_classes=dataset.num_classes, is_training=True)
exclude = ['InceptionResnetV2/Logits', 'InceptionResnetV2/AuxLogits']
variables_to_restore = slim.get_variables_to_restore(exclude=exclude)
one_hot_labels = slim.one_hot_encoding(labels, dataset.num_classes)
loss = tf.losses.softmax_cross_entropy(onehot_labels=one_hot_labels, logits=logits)
total_loss = tf.losses.get_total_loss()
global_step = get_or_create_global_step()
lr = tf.train.exponential_decay(
learning_rate=initial_learning_rate,
global_step=global_step,
decay_steps=decay_steps,
decay_rate=learning_rate_decay_factor,
staircase=True)
optimizer = tf.train.AdamOptimizer(learning_rate=lr)
train_op = slim.learning.create_train_op(total_loss, optimizer)
predictions = tf.argmax(end_points['Predictions'], 1)
probabilities = end_points['Predictions']
accuracy, accuracy_update = tf.contrib.metrics.streaming_accuracy(predictions, labels)
metrics_op = tf.group(accuracy_update, probabilities)
tf.summary.scalar('losses/Total_Loss', total_loss)
tf.summary.scalar('accuracy', accuracy)
tf.summary.scalar('learning_rate', lr)
my_summary_op = tf.summary.merge_all()
def train_step(sess, train_op, global_step):
'''
Simply runs a session for the three arguments provided and gives a logging on the time elapsed for each global step
'''
start_time = time.time()
total_loss, global_step_count, _ = sess.run([train_op, global_step, metrics_op])
time_elapsed = time.time() - start_time
logging.info('global step %s: loss: %.4f (%.2f sec/step)', global_step_count, total_loss, time_elapsed)
return total_loss, global_step_count
saver = tf.train.Saver(variables_to_restore)
def restore_fn(sess):
return saver.restore(sess, checkpoint_file)
sv = tf.train.Supervisor(logdir=log_dir, summary_op=None, init_fn=restore_fn)
with sv.managed_session() as sess:
for step in xrange(num_steps_per_epoch * num_epochs):
if step % num_batches_per_epoch == 0:
logging.info('Epoch %s/%s', step / num_batches_per_epoch + 1, num_epochs)
learning_rate_value, accuracy_value = sess.run([lr, accuracy])
logging.info('Current Learning Rate: %s', learning_rate_value)
logging.info('Current Streaming Accuracy: %s', accuracy_value)
logits_value, probabilities_value, predictions_value, labels_value = sess.run(
[logits, probabilities, predictions, labels])
print 'logits: \n', logits_value
print 'Probabilities: \n', probabilities_value
print 'predictions: \n', predictions_value
print 'Labels:\n:', labels_value
if step % 10 == 0:
loss, _ = train_step(sess, train_op, sv.global_step)
summaries = sess.run(my_summary_op)
sv.summary_computed(sess, summaries)
else:
loss, _ = train_step(sess, train_op, sv.global_step)
logging.info('Final Loss: %s', loss)
logging.info('Final Accuracy: %s', sess.run(accuracy))
logging.info('Finished training! Saving model to disk now.')
sv.saver.save(sess, sv.save_path, global_step=sv.global_step)
</code></pre>
<p>This code seems to work and I have run training on some sample data and
I'm getting 94% accuracy</p>
<p>Evaluation code:</p>
<pre><code>log_dir = './log'
log_eval = './log_eval_test'
dataset_dir = './data'
batch_size = 10
num_epochs = 1
checkpoint_file = tf.train.latest_checkpoint('./')
def run():
if not os.path.exists(log_eval):
os.mkdir(log_eval)
with tf.Graph().as_default() as graph:
tf.logging.set_verbosity(tf.logging.INFO)
dataset = get_split('train', dataset_dir)
images, raw_images, labels = load_batch(dataset, batch_size=batch_size, is_training=False)
num_batches_per_epoch = dataset.num_samples / batch_size
num_steps_per_epoch = num_batches_per_epoch
with slim.arg_scope(inception_resnet_v2_arg_scope()):
logits, end_points = inception_resnet_v2(images, num_classes=dataset.num_classes, is_training=False)
variables_to_restore = slim.get_variables_to_restore()
saver = tf.train.Saver(variables_to_restore)
def restore_fn(sess):
return saver.restore(sess, checkpoint_file)
predictions = tf.argmax(end_points['Predictions'], 1)
accuracy, accuracy_update = tf.contrib.metrics.streaming_accuracy(predictions, labels)
metrics_op = tf.group(accuracy_update)
global_step = get_or_create_global_step()
global_step_op = tf.assign(global_step, global_step + 1)
def eval_step(sess, metrics_op, global_step):
'''
Simply takes in a session, runs the metrics op and some logging information.
'''
start_time = time.time()
_, global_step_count, accuracy_value = sess.run([metrics_op, global_step_op, accuracy])
time_elapsed = time.time() - start_time
logging.info('Global Step %s: Streaming Accuracy: %.4f (%.2f sec/step)', global_step_count, accuracy_value,
time_elapsed)
return accuracy_value
tf.summary.scalar('Validation_Accuracy', accuracy)
my_summary_op = tf.summary.merge_all()
sv = tf.train.Supervisor(logdir=log_eval, summary_op=None, saver=None, init_fn=restore_fn)
with sv.managed_session() as sess:
for step in xrange(num_steps_per_epoch * num_epochs):
sess.run(sv.global_step)
if step % num_batches_per_epoch == 0:
logging.info('Epoch: %s/%s', step / num_batches_per_epoch + 1, num_epochs)
logging.info('Current Streaming Accuracy: %.4f', sess.run(accuracy))
if step % 10 == 0:
eval_step(sess, metrics_op=metrics_op, global_step=sv.global_step)
summaries = sess.run(my_summary_op)
sv.summary_computed(sess, summaries)
else:
eval_step(sess, metrics_op=metrics_op, global_step=sv.global_step)
logging.info('Final Streaming Accuracy: %.4f', sess.run(accuracy))
raw_images, labels, predictions = sess.run([raw_images, labels, predictions])
for i in range(10):
image, label, prediction = raw_images[i], labels[i], predictions[i]
prediction_name, label_name = dataset.labels_to_name[prediction], dataset.labels_to_name[label]
text = 'Prediction: %s \n Ground Truth: %s' % (prediction_name, label_name)
img_plot = plt.imshow(image)
plt.title(text)
img_plot.axes.get_yaxis().set_ticks([])
img_plot.axes.get_xaxis().set_ticks([])
plt.show()
logging.info(
'Model evaluation has completed! Visit TensorBoard for more information regarding your evaluation.')
</code></pre>
<p>So after training the model and getting 94% accuracy I tried to
evaluate the model. On evaluation I get 0-1% accuracy the whole time. I
investigated this only to find that it is predicting the same class every
time</p>
<pre><code>labels: [7, 11, 5, 1, 20, 0, 18, 1, 0, 7]
predictions: [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
</code></pre>
<p>Can anyone help with where I may be going wrong?</p>
<p>EDIT:</p>
<p>TensorBoard accuracy and loss from training</p>
<p><a href="https://i.stack.imgur.com/NLiwC.png" rel="nofollow
noreferrer"><img src="https://i.stack.imgur.com/NLiwC.png" alt="TensorBoard chart from training"></a>
<a href="https://i.stack.imgur.com/QdX6d.png" rel="nofollow
noreferrer"><img src="https://i.stack.imgur.com/QdX6d.png" alt="TensorBoard chart from training"></a></p>
<p>TensorBoard accuracy from evaluation</p>
<p><a href="https://i.stack.imgur.com/TNE5B.png" rel="nofollow
noreferrer"><img src="https://i.stack.imgur.com/TNE5B.png" alt="TensorBoard accuracy from evaluation"></a></p>
<p>EDIT:</p>
<p>I've still not been able to solve this issue. I thought there might be
a problem with how I am restoring the graph in the eval script, so I tried
using this to restore the model instead</p>
<pre><code>saver = tf.train.import_meta_graph('/log/model.ckpt.meta')
def restore_fn(sess):
return saver.restore(sess, checkpoint_file)
</code></pre>
<p>instead of</p>
<pre><code>variables_to_restore = slim.get_variables_to_restore()
saver = tf.train.Saver(variables_to_restore)
def restore_fn(sess):
return saver.restore(sess, checkpoint_file)
</code></pre>
<p>and this just takes a very long time to start and finally errors. I
then tried using V1 of the writer in the saver (<code>saver =
tf.train.Saver(variables_to_restore,
write_version=saver_pb2.SaverDef.V1)</code>) and retrained, but was unable
to load this checkpoint at all as it said variables were missing.</p>
<p>I also attempted to run my eval script with the same data it trained on,
just to see if this might give different results, yet I get the same.</p>
<p>Finally I re-cloned the repo from the url and ran a train using the
same dataset in the tutorial, and I get 0-3% accuracy when I evaluate even
after getting it to 84% whilst training. Also my checkpoints must have the
correct information, as when I restart training the accuracy continues from
where it left off. It feels like I'm not doing something correctly when I
restore the model. Would really appreciate any suggestions on this as I'm
at a dead end currently :(</p>
- text: >
<p>I've just started using tensorflow for a project I'm working on. The
program aims to be a binary classifier with input being 12 features. The
output is either normal patient or patient with a disease. The prevalence
of the disease is quite low and so my dataset is very imbalanced, with 502
examples of normal controls and only 38 diseased patients. For this
reason, I'm trying to use
<code>tf.nn.weighted_cross_entropy_with_logits</code> as my cost
function.</p>
<p>The code is based on the iris custom estimator from the official
tensorflow documentation, and works with
<code>tf.losses.sparse_softmax_cross_entropy</code> as the cost function.
However, when I change to <code>weighted_cross_entropy_with_logits</code>,
I get a shape error and I'm not sure how to fix this.</p>
<pre><code>ValueError: logits and targets must have the same shape ((?, 2)
vs (?,))
</code></pre>
<p>I have searched, and similar problems have been solved by just reshaping
the labels. I have tried to do this unsuccessfully (and don't understand
why <code>tf.losses.sparse_softmax_cross_entropy</code> works fine and the
weighted version does not).</p>
<p>My full code is here
<a
href="https://gist.github.com/revacious/83142573700c17b8d26a4a1b84b0dff7"
rel="nofollow
noreferrer">https://gist.github.com/revacious/83142573700c17b8d26a4a1b84b0dff7</a></p>
<p>Thanks!</p>
- text: >
<p>In the documentation it seems they focus on how to save and restore
tf.keras models, but I was wondering: how do you save and restore models
trained through a custom basic iteration loop?</p>
<p>Now that there isn't a graph or a session, how do we save structure
defined in a tf function that is custom-built without using layer
abstractions?</p>
- text: >
<p>I simply have <code>train = optimizer.minimize(loss =
tf.constant(4,dtype="float32"))</code>. This is the line of code that I
changed; before that, everything was working.<br/></p>
<p>Why is it giving an error? The documentation says it can be a tensor. <a
href="https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Adam#minimize"
rel="nofollow noreferrer">Here are the docs</a></p>
<pre><code>W = tf.Variable([0.5],tf.float32)
b = tf.Variable([0.1],tf.float32)
x = tf.placeholder(tf.float32)
y= tf.placeholder(tf.float32)
discounted_reward = tf.placeholder(tf.float32,shape=[4,],
name="discounted_reward")
linear_model = W*x + b
squared_delta = tf.square(linear_model - y)
print(squared_delta)
loss = tf.reduce_sum(squared_delta*discounted_reward)
print(loss)
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss = tf.constant(4,dtype="float32"))
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
for i in range(3):
sess.run(train,{x:[1,2,3,4],y:[0,-1,-2,-3],discounted_reward:[1,2,3,4]})
print(sess.run([W,b]))
</code></pre>
<hr>
<p>I really need this thing to work. In this particular example we could
solve it in other ways, but I need it to work like this, as my actual code
can only do this.</p>
<hr/>
<p>The error is</p>
<pre><code>> ValueError: No gradients provided for any variable, check
your graph
> for ops that do not support gradients, between variables
> ["<tf.Variable 'Variable:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_1:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_2:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_3:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_4:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_5:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_6:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_7:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_8:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_9:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_10:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_11:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_12:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_13:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_14:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_15:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_16:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_17:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_18:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_19:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_20:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_21:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_22:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_23:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_24:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_25:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_26:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_27:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_28:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_29:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_30:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_31:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_32:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_33:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_34:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_35:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_36:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_37:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_38:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_39:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_40:0' shape=(1,) dtype=float32_ref>",
> "<tf.Variable 'Variable_41:0' shape=(1,) dtype=float32_ref>"]
and loss
> Tensor("Const_4:0", shape=(), dtype=float32).
</code></pre>
- text: >
<p>I found in the <a href="https://www.tensorflow.org/tutorials/recurrent"
rel="nofollow noreferrer">tensorflow doc</a>:</p>
<p><code>
stacked_lstm = tf.contrib.rnn.MultiRNNCell([lstm] * number_of_layers,
...
</code></p>
<p>I need to use <code>MultiRNNCell</code>, but when I write these lines</p>
<p><code>
a = [tf.nn.rnn_cell.BasicLSTMCell(10)]*3
print id(a[0]), id(a[1])
</code></p>
<p>Its output is <code>[4648063696 4648063696]</code>.</p>
<p>Can <code>MultiRNNCell</code> take a list of the same
<code>BasicLSTMCell</code> object as its parameter?</p>
pipeline_tag: text-classification
inference: true
base_model: sentence-transformers/all-mpnet-base-v2
model-index:
- name: SetFit with sentence-transformers/all-mpnet-base-v2
results:
- task:
type: text-classification
name: Text Classification
dataset:
name: Unknown
type: unknown
split: test
metrics:
- type: accuracy
value: 0.85
name: Accuracy
- type: precision
value: 0.8535353535353536
name: Precision
- type: recall
value: 0.85
name: Recall
- type: f1
value: 0.8496240601503761
name: F1
---

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
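As a rough illustration of that recipe, here is a minimal training sketch using the `setfit` library (assuming setfit >= 1.0; the tiny inline dataset and its labels are purely illustrative, not this model's actual training data):

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Illustrative few-shot dataset: a handful of labeled texts per class.
train_dataset = Dataset.from_dict({
    "text": [
        "How do I restore a TensorFlow checkpoint in my eval script?",
        "Why does optimizer.minimize raise a 'no gradients' error here?",
    ],
    "label": [0, 1],
})

# Start from the base Sentence Transformer named by this card.
model = SetFitModel.from_pretrained("sentence-transformers/all-mpnet-base-v2")

args = TrainingArguments(batch_size=16, num_epochs=1)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)

# Step 1 (contrastive fine-tuning) and step 2 (classification head)
# both happen inside train().
trainer.train()
```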
Then you can load this model and run inference.
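For example (a minimal sketch; the repository ID below is a placeholder for wherever this checkpoint is actually hosted):

```python
from setfit import SetFitModel

# Hypothetical repo ID -- replace with this model's actual Hub location.
model = SetFitModel.from_pretrained("your-username/setfit-all-mpnet-base-v2")

# Run inference on new text; returns one predicted label per input.
preds = model.predict([
    "<p>How do I save and restore a model trained with a custom loop?</p>",
])
print(preds)
```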