tennessejoyce committed
Commit d78f7d1 · 1 Parent(s): 8e4c5a0

Update README.md

Files changed (1):
  1. README.md +16 -14
README.md CHANGED
---
license: cc-by-4.0
widget:
- text: "[Gmail API] How can I extract plain text from an email sent to me?"
---
 
# Titlewave: bert-base-uncased

## Model description

Titlewave is a Chrome extension that helps you choose better titles for your Stack Overflow questions. See the [GitHub repository](https://github.com/tennessejoyce/TitleWave) for more information.
This is one of two NLP models used in the Titlewave project; its purpose is to classify whether a question will be answered based only on its title. The [companion model](https://huggingface.co/tennessejoyce/titlewave-t5-small) suggests a new title based on the body of the question.
 
## Intended use

Try out different titles for your Stack Overflow post, and see which one gives you the best chance of receiving an answer.
This model can be used in your browser as a Chrome extension by following the installation instructions in the [GitHub repository](https://github.com/tennessejoyce/TitleWave).
Or load it in Python like this (which automatically downloads the model to your machine):
```python
>>> from transformers import pipeline
>>> # Checkpoint name assumed from this model card; adjust if the repository id differs.
>>> classifier = pipeline('text-classification', model='tennessejoyce/titlewave-bert-base-uncased')
>>> classifier('[Gmail API] How can I extract plain text from an email sent to me?')
```

The 'score' in the output represents the probability of getting an answer with this title.
## Training data

The weights were initialized from the [BERT base model](https://huggingface.co/bert-base-uncased), which was trained on BookCorpus and English Wikipedia.
Then the model was fine-tuned on the dataset of previous Stack Overflow post titles, which is publicly available [here](https://archive.org/details/stackexchange).
Specifically, I used three years of posts from 2017-2019, filtered out posts which were closed (e.g., duplicates, off-topic), and selected 5% of the remaining posts at random for the training set, with the same amount for the validation and test sets (278,155 posts each).
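A minimal sketch of that filtering and split (the file path is hypothetical and the use of pandas is illustrative; in the Stack Exchange dump, open posts have a null `ClosedDate`):

```python
import pandas as pd

# Posts table exported from the Stack Exchange dump (path is hypothetical).
posts = pd.read_csv('stackoverflow_posts_2017_2019.csv')

# Drop closed posts (duplicates, off-topic, etc.): open posts have no ClosedDate.
open_posts = posts[posts['ClosedDate'].isna()]

# Shuffle once, then take three disjoint 5% slices for train/validation/test.
shuffled = open_posts.sample(frac=1.0, random_state=42).reset_index(drop=True)
n = int(0.05 * len(shuffled))
train, validation, test = shuffled[:n], shuffled[n:2*n], shuffled[2*n:3*n]
```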
 
## Training procedure

The model was fine-tuned for two epochs with a batch size of 32 (17,384 steps total) using 16-bit mixed precision.
After some hyperparameter tuning, I found that the following two-phase training procedure yielded the best performance (ROC-AUC score) on the validation set:
* In the first epoch, all layers were frozen except for the last two (pooling layer and classification layer), and a learning rate of 3e-4 was used.
* In the second epoch, all layers were unfrozen, and the learning rate was decreased by a factor of 10 to 3e-5.

Otherwise, all parameters were set to the defaults listed [here](https://huggingface.co/transformers/main_classes/trainer.html#transformers.TrainingArguments), including the AdamW optimizer and a linearly decreasing learning rate schedule (both of which were reset between the two epochs).
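A minimal sketch of this two-phase schedule with the Hugging Face `Trainer` (the tiny dataset and output directory are placeholders, not the actual training script; the hyperparameters are the ones listed above):

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased')

# Toy dataset standing in for the ~278k tokenized Stack Overflow titles.
titles = ['How can I extract plain text from an email sent to me?', 'please help!!']
labels = [1, 0]
encodings = tokenizer(titles, truncation=True, padding=True)
train_dataset = Dataset.from_dict({**encodings, 'labels': labels})

def run_phase(lr, freeze_encoder):
    # Phase 1 freezes everything except the pooling and classification layers;
    # phase 2 unfreezes the whole network at a 10x lower learning rate.
    for name, param in model.named_parameters():
        param.requires_grad = (not freeze_encoder) or name.startswith(('bert.pooler', 'classifier'))
    args = TrainingArguments(
        output_dir='titlewave-classifier',   # placeholder
        num_train_epochs=1,
        per_device_train_batch_size=32,
        learning_rate=lr,   # decays linearly to zero under the Trainer's default schedule
        fp16=True,          # 16-bit mixed precision (requires a CUDA GPU)
    )
    Trainer(model=model, args=args, train_dataset=train_dataset).train()

run_phase(lr=3e-4, freeze_encoder=True)    # epoch 1: train only the head
run_phase(lr=3e-5, freeze_encoder=False)   # epoch 2: fine-tune everything
```

Running each phase as a separate one-epoch `Trainer` call is what resets the optimizer and learning rate schedule between the two epochs.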
 
 
## Evaluation

See [this notebook](https://github.com/tennessejoyce/TitleWave/blob/master/model_training/test_classifier.ipynb) for the performance of the title classification model on the test set.
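For a quick local check, here is a minimal sketch of computing the ROC-AUC score with scikit-learn (toy data; the checkpoint name and the default LABEL_0/LABEL_1 mapping are assumptions, and the linked notebook remains the authoritative analysis):

```python
from sklearn.metrics import roc_auc_score
from transformers import pipeline

# Checkpoint name assumed from this model card; LABEL_1 is assumed to mean 'answered'.
classifier = pipeline('text-classification', model='tennessejoyce/titlewave-bert-base-uncased')

# Toy examples; the real evaluation uses the 278,155-post test split.
titles = ['[Gmail API] How can I extract plain text from an email sent to me?',
          'please help me fix this asap']
answered = [1, 0]

# Convert each prediction into the probability of the positive ('answered') class.
probs = [out['score'] if out['label'] == 'LABEL_1' else 1.0 - out['score']
         for out in classifier(titles)]
print(roc_auc_score(answered, probs))
```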
 
 