ierhon committed on
Commit
636327e
·
1 Parent(s): c323e5c

Update the model

Browse files
Files changed (1) hide show
  1. train.py +3 -4
train.py CHANGED
@@ -28,7 +28,7 @@ model.add(Dropout(0.5)) # dropout makes the task harder by removing some information
28
  model.add(Dense(512, activation="relu"))
29
  model.add(Dense(512, activation="relu"))
30
  model.add(Dense(256, activation="relu"))
31
- model.add(Dense(dset_size, activation="linear")) # TBH it doesn't matter that much what activation function to use IN THIS CASE, IN THIS LINE (in others it might be a really big deal), just linear does nothing at all to the output, that might be something like softmax but i'll test that later
32
 
33
  X = [] # we're loading the training data into input X
34
  y = [] # and output y
@@ -43,11 +43,10 @@ for key in dset:
43
  X = np.array(X) # normal lists are way slower than numpy arrays (remember, a list and an array is not the same thing, an array is far more limited)
44
  y = np.array(y) # that's why keras supports only numpy arrays ^
45
 
46
- model.compile(optimizer=Adam(), loss="mse", metrics=["accuracy",]) # kind of like settings for the training
47
- # TODO: change the loss
48
 
49
  model.fit(X, y, epochs=10, batch_size=8) # training the model, epochs means how many times does it have to read the data, batch_size is an optimization to train on multiple messages at the same time. Loss and accuracy are the opposite things, loss is how far the output is from a correct one, from 1 to 0, and accuracy how often does the model get the answer right, from 0 to 1.
50
- # Use workers=4, use_multiprocessing=True) if you don't have a GPU
51
 
52
  model.summary() # just for you to see info about the model, useful because you can check the parameter count
53
 
 
28
  model.add(Dense(512, activation="relu"))
29
  model.add(Dense(512, activation="relu"))
30
  model.add(Dense(256, activation="relu"))
31
+ model.add(Dense(dset_size, activation="softmax")) # softmax is made for output, if the output should have only 1 neuron active, that means only one positive number is allowed and other are zeros
32
 
33
  X = [] # we're loading the training data into input X
34
  y = [] # and output y
 
43
  X = np.array(X) # normal lists are way slower than numpy arrays (remember, a list and an array is not the same thing, an array is far more limited)
44
  y = np.array(y) # that's why keras supports only numpy arrays ^
45
 
46
+ model.compile(optimizer=Adam(), loss="categorical_crossentropy", metrics=["accuracy",]) # settings for the training, loss means the way to calculate loss - categorical crossentropy
 
47
 
48
  model.fit(X, y, epochs=10, batch_size=8) # training the model, epochs means how many times does it have to read the data, batch_size is an optimization to train on multiple messages at the same time. Loss and accuracy are the opposite things, loss is how far the output is from a correct one, from 1 to 0, and accuracy how often does the model get the answer right, from 0 to 1.
49
+ # Add , workers=4, use_multiprocessing=True) if you don't have a GPU
50
 
51
  model.summary() # just for you to see info about the model, useful because you can check the parameter count
52