krogoldAI committed · Commit 8704610 · verified · 1 Parent(s): 9a013c2

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -85,8 +85,8 @@ The following examples was obtained with `temperature = 0.85`.
 
  CelineGPT-12B-240831 was trained on a diverse, *French-only* corpus consisting of:
  - All available interviews (TV, radio, written) given by Céline;
- - Portions of Céline's post-war works, notably the dialogues in 'Conversations with Professor Y,' 'Castle to Castle,' 'North,' 'Rigadoon,' 'Fable for Another Time,' and various other texts;
- - A significant portion of Céline's correspondence, manually transformed into question-and-answer pairs.
+ - A significant portion of Céline's correspondence, manually transformed into question-and-answer pairs;
+ - Portions of Céline's post-war works, notably the dialogues in 'Conversations with Professor Y,' 'Castle to Castle,' 'North,' 'Rigadoon,' 'Fable for Another Time,' and various other texts.
 
  The data has been manually gathered and cleaned/standardized to avoid contradictions (concerning, for example, Céline's age, last book published, and so on).
  The dataset spans various topics (Céline's life, opinion on a broad set of topics, attitude towards writing, etc), but does not include his controversial opinions held prior to WW2 (but it does contain data on his post-war attitude towards his past controversial opinions). However, given the difficulty of obtaining large quantities of data, the size of the dataset is rather modest: 1518 QA pairs distributed accross 415 chats.