Update README.md


README.md

	@@ -15,3 +15,20 @@ The limitations of this model are that it can only generate text in the style of
 I created my own dataset to train this model. I chose 14 novels written by H G Wells for my dataset. Most of the novels in the dataset are of the genre science fiction. The dataset contains more than 1 million tokens.
+The corpus consists of the following novels by H G Wells:
+The Time Machine
+In the Days of the Comet
+The Food of the Gods
+Tales of Space and Time
+The World Set Free
+The War of the Worlds
+The First Men in the Moon
+The Invisible Man
+The Island of Doctor Moreau
+The Sleeper Awakes
+The War in the Air
+The Research Magnificent
+The Undying Fire
+The Red Room
+The total number of tokens in the corpus is 1,043,588.
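
The README does not say which tokenizer produced the token count, so the figure cannot be reproduced exactly from the description alone. As a rough sketch of how such a total could be computed, the snippet below sums token counts over a list of texts. The whitespace tokenizer, the `count_tokens` and `read_corpus` helpers, and the one-novel-per-`.txt`-file layout are all assumptions made for illustration, not the author's actual pipeline.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def count_tokens(
    texts: Iterable[str],
    tokenize: Callable[[str], List[str]] = str.split,
) -> int:
    """Return the total number of tokens across all texts.

    The default tokenizer splits on whitespace (an assumption); pass the
    model's own tokenizer function to get a count comparable to the README.
    """
    return sum(len(tokenize(text)) for text in texts)


def read_corpus(directory: str) -> List[str]:
    """Read every .txt file in a directory.

    Hypothetical layout: one novel per plain-text file.
    """
    paths = sorted(Path(directory).glob("*.txt"))
    return [path.read_text(encoding="utf-8") for path in paths]


if __name__ == "__main__":
    # Two short in-memory strings stand in for the novels here.
    sample = ["The Time Traveller smiled.", "No one would have believed"]
    print(count_tokens(sample))  # 4 + 5 whitespace-separated tokens = 9
```

With the real corpus on disk, `count_tokens(read_corpus("corpus/"))` would give a whitespace-based total, where `corpus/` is a placeholder path. A subword tokenizer would generally report a different number for the same text.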