Update README.md

---

Introduction:

This repository contains a finetuned DistilGPT2 model for generating diverse essays on topics spanning Arts, Science, and Culture.

The model has been trained on a dataset of over 2000 high-quality essays written by human experts, covering a wide range of opinions and knowledge.

Dataset:

The training dataset comprises 2000+ essays covering diverse topics in Arts, Science, and Culture. These essays were written by human experts and contain a diverse set of opinions and knowledge, ensuring that the model learns from high-quality, varied content.

Model Training:

- epoch: 50
- training_loss: 2.473200
- validation_loss: 4.569556
- perplexities: [517.4149169921875, 924.535888671875, 704.73291015625, 465.9677429199219, 577.629150390625, 443.994140625, 770.1861572265625, 683.028076171875, 1017.7510375976562, 880.795166015625]
- mean_perplexity: 698.603519 (see the sanity check below)
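
The reported mean_perplexity is the arithmetic mean of the per-batch perplexities above (each per-batch value is conventionally the exponential of that batch's cross-entropy loss). A quick sanity check in Python:

```python
import math

# Per-batch validation perplexities reported above; each is conventionally
# exp(cross-entropy loss) for that batch.
perplexities = [
    517.4149169921875, 924.535888671875, 704.73291015625,
    465.9677429199219, 577.629150390625, 443.994140625,
    770.1861572265625, 683.028076171875, 1017.7510375976562,
    880.795166015625,
]

# Arithmetic mean of the per-batch values.
mean_perplexity = sum(perplexities) / len(perplexities)
print(f"{mean_perplexity:.6f}")  # 698.603519

# Note: this differs from exp(validation_loss) = exp(4.569556) ~ 96.5,
# because averaging perplexities directly up-weights hard batches
# (Jensen's inequality: mean(exp(x)) >= exp(mean(x))).
print(f"{math.exp(4.569556):.1f}")
```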

Description:

The model achieved a mean perplexity of 698.603519 on the validation set; lower perplexity indicates a better fit to the held-out essays.

During text generation, the following parameters are used (a runnable sketch follows the list):

- `max_length`: The maximum length of the generated text, set to 400 tokens.
- `num_beams`: The number of beams for beam search, set to 10. A higher value explores more candidate sequences, which can improve output quality but also increases inference time.
- `early_stopping`: If set to True, generation stops as soon as the end-of-sequence token is generated.
- `temperature`: The sampling temperature, set to 0.3. Lower values make the output more focused and deterministic.
- `no_repeat_ngram_size`: The size of n-grams that are prevented from repeating in the generated text, set to 2.
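
A minimal generation sketch with these settings, using the Hugging Face `transformers` library; the model ID and prompt below are placeholders (load the finetuned checkpoint from the notebook instead of the base `distilgpt2`):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: substitute the finetuned checkpoint from the Kaggle notebook.
model_id = "distilgpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "The role of art in modern science"  # hypothetical essay prompt
inputs = tokenizer(prompt, return_tensors="pt")

# The generation parameters described in the list above. Note that in
# transformers, `temperature` only takes effect when sampling is enabled
# (do_sample=True); with pure beam search it is ignored.
outputs = model.generate(
    **inputs,
    max_length=400,
    num_beams=10,
    early_stopping=True,
    temperature=0.3,
    no_repeat_ngram_size=2,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```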

The Kaggle notebook for this project is available here: [Kaggle Notebook](https://www.kaggle.com/code/vignesharjunraj/finetuned-distilgpt2-llm-for-essays-400-words/)
|