Locutusque
/

gpt2-conversational-or-qa

Text Generation

text-generation-inference

Model card Files Files and versions

Locutusque commited on May 14, 2023

Commit

268f7cf

·

1 Parent(s): d2d5ea5

Update README.md

Files changed (1) hide show

README.md +1 -1

README.md CHANGED Viewed

@@ -73,7 +73,7 @@ The model architecture used in this model is GPT-2, a transformer-based language
 The model is evaluated based on several metrics, including loss, reward, penalty, BLEU score, and perplexity. The loss metric is calculated during training and reflects the difference between the predicted output and the actual output. The reward metric is based on the number of correct words generated by the model, while the penalty metric penalizes the model for repeating words consecutively. The BLEU score measures the similarity between the generated text and the ground truth text, while the perplexity metric measures how well the model is able to predict the next word in a sequence. During validation, the model achieved the following metrics:
 - BLEU Score: 9
-- Perplexity: 41
 - Loss: 1.7
 Although these metrics seem mediocre, it's actually better because that way the model is able to make open-ended responses, but is still coherent to the user's input.

 The model is evaluated based on several metrics, including loss, reward, penalty, BLEU score, and perplexity. The loss metric is calculated during training and reflects the difference between the predicted output and the actual output. The reward metric is based on the number of correct words generated by the model, while the penalty metric penalizes the model for repeating words consecutively. The BLEU score measures the similarity between the generated text and the ground truth text, while the perplexity metric measures how well the model is able to predict the next word in a sequence. During validation, the model achieved the following metrics:
 - BLEU Score: 9
+- Perplexity: 22
 - Loss: 1.7
 Although these metrics seem mediocre, it's actually better because that way the model is able to make open-ended responses, but is still coherent to the user's input.