tennessejoyce committed b8a1c44 (parent: d78f7d1): Update README.md

## Model description

Titlewave is a Chrome extension that helps you choose better titles for your Stack Overflow questions. See the [github repository](https://github.com/tennessejoyce/TitleWave) for more information.

This is one of two NLP models used in the Titlewave project, and its purpose is to classify whether or not a question will be answered, based only on its title. The [companion model](https://huggingface.co/tennessejoyce/titlewave-t5-small) suggests a new title based on the body of the question.
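
As a quick illustration, here is a minimal sketch of querying the classifier through the `transformers` pipeline API. The model id below is assumed from this model card's name rather than stated in the text:

```python
# Minimal usage sketch. The model id is an assumption based on this card's
# name; the pipeline returns a label and a confidence score for the title.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="tennessejoyce/titlewave-bert-base-uncased",  # assumed model id
)

print(classifier("How do I merge two dictionaries in Python?"))
# e.g. [{'label': ..., 'score': ...}]
```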

## Intended use

## Training procedure

The model was fine-tuned for two epochs with a batch size of 32 (17,384 steps total) using 16-bit mixed precision.
After some hyperparameter tuning, I found that the following two-phase training procedure yields the best performance (ROC-AUC score) on the validation set:
* In the first epoch, all layers were frozen except for the last two (the pooling layer and the classification layer), and a learning rate of 3e-4 was used.
* In the second epoch, all layers were unfrozen and the learning rate was decreased by a factor of 10, to 3e-5.

Otherwise, all parameters were set to the defaults listed [here](https://huggingface.co/transformers/main_classes/trainer.html#transformers.TrainingArguments),
including the AdamW optimizer and a linearly decreasing learning-rate schedule (both of which were reset between the two epochs). See the [github repository](https://github.com/tennessejoyce/TitleWave) for the scripts used to train the model; a rough sketch of the two-phase setup is shown below.
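
For concreteness, here is a minimal sketch of that two-phase schedule with the `transformers` Trainer API (not the project's actual script). The base checkpoint, output directories, and the `train_dataset` variable (a pre-tokenized dataset of titles with answered/unanswered labels) are assumptions for illustration:

```python
# Hypothetical sketch of the two-phase fine-tuning described above.
# Assumptions: a BERT base checkpoint and a pre-tokenized `train_dataset`.
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments)

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

# Phase 1: freeze everything except the last two layers
# (the pooler and the classification head), learning rate 3e-4.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith(("bert.pooler", "classifier"))

phase1 = TrainingArguments(output_dir="phase1", num_train_epochs=1,
                           per_device_train_batch_size=32,
                           learning_rate=3e-4, fp16=True)
Trainer(model=model, args=phase1, train_dataset=train_dataset).train()

# Phase 2: unfreeze all layers and cut the learning rate by 10x, to 3e-5.
# Starting a fresh Trainer also resets the AdamW optimizer state and the
# linearly decreasing learning-rate schedule, as described above.
for param in model.parameters():
    param.requires_grad = True

phase2 = TrainingArguments(output_dir="phase2", num_train_epochs=1,
                           per_device_train_batch_size=32,
                           learning_rate=3e-5, fp16=True)
Trainer(model=model, args=phase2, train_dataset=train_dataset).train()
```

The ROC-AUC score mentioned above could then be computed on a held-out validation set, e.g. with `sklearn.metrics.roc_auc_score` on the model's predicted probabilities, to compare hyperparameter settings.
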
## Evaluation