bert-drug-review-to-condition
This model is a fine-tuned version of bert-base-uncased on this dataset: Kallumadi,Surya and Grer,Felix. (2018). Drug Reviews (Drugs.com). UCI Machine Learning Repository. https://doi.org/10.24432/C5SK5S. It achieves the following results on the evaluation set:
- Loss: 0.6678
- Accuracy: 0.8376
- Precision: 0.8325
- Recall: 0.8376
- F1: 0.8317
Model description
"bert-base-uncased" fine-tuned for text-classification (multiclass): from input text, the model outputs the most likely medical pathology of the person. Training based on predicting 'condition' feature from 'review' feature (i.e., the person reviews the drugs they are taking for their condition)
Intended uses & limitations
Personal project
Training and evaluation data
The 100 most frequent conditions of the dataset are selected: {0: 'multiple sclerosis', 1: 'overactive bladde', 2: 'hyperhidrosis', 3: 'ibromyalgia', 4: 'menstrual disorders', 5: 'hypogonadism, male', 6: 'rosacea', 7: 'muscle spasm', 8: 'high blood pressure', 9: 'epilepsy', 10: 'psoriatic arthritis', 11: 'post traumatic stress disorde', 12: 'smoking cessation', 13: 'not listed / othe', 14: 'herpes simplex', 15: 'opiate dependence', 16: 'social anxiety disorde', 17: 'urticaria', 18: 'allergic rhinitis', 19: 'polycystic ovary syndrome', 20: 'obsessive compulsive disorde', 21: 'depression', 22: 'migraine prevention', 23: 'neuropathic pain', 24: 'ankylosing spondylitis', 25: 'skin or soft tissue infection', 26: 'constipation, drug induced', 27: 'obesity', 28: 'vaginal yeast infection', 29: 'osteoarthritis', 30: 'restless legs syndrome', 31: 'plaque psoriasis', 32: 'panic disorde', 33: 'abnormal uterine bleeding', 34: 'adhd', 35: 'high cholesterol', 36: 'diabetes, type 2', 37: 'anxiety and stress', 38: 'asthma, maintenance', 39: 'pneumonia', 40: 'schizophrenia', 41: 'opiate withdrawal', 42: 'osteoporosis', 43: 'influenza', 44: 'weight loss', 45: 'cough and nasal congestion', 46: 'birth control', 47: 'benign prostatic hyperplasia', 48: 'helicobacter pylori infection', 49: 'anxiety', 50: 'bronchitis', 51: 'rheumatoid arthritis', 52: 'narcolepsy', 53: 'generalized anxiety disorde', 54: 'insomnia', 55: 'nasal congestion', 56: 'major depressive disorde', 57: 'schizoaffective disorde', 58: 'psoriasis', 59: 'premenstrual dysphoric disorde', 60: 'bacterial vaginitis', 61: 'motion sickness', 62: 'erectile dysfunction', 63: 'constipation, chronic', 64: 'copd, maintenance', 65: 'back pain', 66: 'alcohol dependence', 67: 'migraine', 68: 'bladder infection', 69: 'underactive thyroid', 70: 'ulcerative colitis', 71: 'chronic pain', 72: 'hiv infection', 73: 'cold sores', 74: 'breast cance', 75: 'bipolar disorde', 76: 'irritable bowel syndrome', 77: 'anesthesia', 78: 'onychomycosis, toenail', 79: 'chlamydia infection', 80: 'gerd', 81: 'endometriosis', 82: 'seizures', 83: 'alcohol withdrawal', 84: 'bowel preparation', 85: 'hot flashes', 86: 'bacterial infection', 87: 'inflammatory conditions', 88: 'constipation', 89: 'headache', 90: 'urinary tract infection', 91: 'sinusitis', 92: 'emergency contraception', 93: 'cough', 94: 'acne', 95: 'atrial fibrillation', 96: 'pain', 97: 'nausea/vomiting', 98: 'hepatitis c', 99: 'postmenopausal symptoms'} The 'review' feature is lowercased and are only selected examples with more than 16 characters.
Training procedure
See code available at: https://github.com/mlafuentem/Marcuswas-bert-drug-review-to-condition/blob/main/Exercise_classification_conditions_code.ipynb
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|---|---|---|
0.8469 | 1.0 | 13390 | 0.8275 | 0.7673 | 0.7686 | 0.7673 | 0.7551 |
0.6319 | 2.0 | 26780 | 0.6895 | 0.8094 | 0.8090 | 0.8094 | 0.7978 |
0.4116 | 3.0 | 40170 | 0.6678 | 0.8376 | 0.8325 | 0.8376 | 0.8317 |
Framework versions
- Transformers 4.40.0
- Pytorch 2.2.1+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1
- Downloads last month
- 16
Model tree for Marcuswas/bert-drug-review-to-condition
Base model
google-bert/bert-base-uncased