metadata
license: apache-2.0
base_model: microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext
tags:
  - generated_from_trainer
metrics:
  - precision
  - recall
  - accuracy
  - f1
model-index:
  - name: pretoxtm-sentence-classifier
    results: []
datasets:
  - javicorvi/pretoxtm-dataset
language:
  - en
pipeline_tag: text-classification

pretoxtm-sentence-classifier

This model is a fine-tuned version of microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext on the javicorvi/pretoxtm-dataset dataset. It achieves the following results on the evaluation set (the values match the epoch 1 row of the training results table):

  • Loss: 0.1181
  • Precision: 0.9788
  • Recall: 0.9800
  • Accuracy: 0.9795
  • F1: 0.9794
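
As a quick consistency check on the numbers above: F1 is the harmonic mean of precision and recall when all three are computed for the same class and averaging scheme. The card does not state which averaging mode was used, so the sketch below only checks that the reported values agree with that relation. `f1_from_precision_recall` is a hypothetical helper name, not part of the training code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported on this card for the evaluation set.
reported = {"precision": 0.9788, "recall": 0.9800, "f1": 0.9794}

computed = f1_from_precision_recall(reported["precision"], reported["recall"])
print(f"computed F1 = {computed:.4f}, reported F1 = {reported['f1']:.4f}")
```

Rounded to four decimals, the computed value agrees with the reported F1.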

Model description

PretoxTM Sentence Classifier is a sentence-level text classifier fine-tuned on preclinical toxicology literature. It is designed to detect sentences that contain treatment-related findings.
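
A minimal inference sketch with the Transformers `pipeline` API is shown below. It makes two assumptions that the card does not confirm: the repository id `javicorvi/pretoxtm-sentence-classifier` is inferred from the author and model name, and the label strings returned by the model depend on its configuration, so `keep_label` takes the label of interest as an argument. Both helper names are hypothetical.

```python
from typing import Dict, List

# Assumption: repository id inferred from the card's author and model name.
MODEL_ID = "javicorvi/pretoxtm-sentence-classifier"


def classify_sentences(sentences: List[str], model_id: str = MODEL_ID) -> List[Dict]:
    """Return one {'label': ..., 'score': ...} dict per input sentence."""
    # Imported lazily so keep_label() can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # truncation=True cuts sentences longer than the model's maximum input length.
    return classifier(sentences, truncation=True)


def keep_label(sentences: List[str], predictions: List[Dict],
               label: str, min_score: float = 0.5) -> List[str]:
    """Keep sentences whose predicted label equals `label` with score >= min_score."""
    return [
        sentence
        for sentence, pred in zip(sentences, predictions)
        if pred["label"] == label and pred["score"] >= min_score
    ]


# Example (downloads the model on first use; the sentence is illustrative only):
# preds = classify_sentences(["Liver weight was increased at the high dose."])
# print(preds)
```

Inspecting the raw predictions first is the safest way to find out which label string marks a treatment-related finding before filtering with `keep_label`.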

Training and evaluation data

The model was trained on javicorvi/pretoxtm-dataset.

The dataset is divided into train, validation, and test splits.
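
The dataset can presumably be loaded with the `datasets` library, as sketched below. This assumes the dataset loads without a configuration name and that the splits are exposed as a `DatasetDict`; the exact split names are not stated on this card. `load_pretoxtm` and `split_sizes` are hypothetical helper names.

```python
from typing import Dict, Mapping, Sized

DATASET_ID = "javicorvi/pretoxtm-dataset"  # dataset id as given on this card


def load_pretoxtm(dataset_id: str = DATASET_ID):
    """Download the dataset from the Hugging Face Hub."""
    # Lazy import; requires the `datasets` package and network access.
    from datasets import load_dataset

    return load_dataset(dataset_id)


def split_sizes(splits: Mapping[str, Sized]) -> Dict[str, int]:
    """Map each split name to its number of examples."""
    return {name: len(split) for name, split in splits.items()}


# Example (requires network access):
# print(split_sizes(load_pretoxtm()))
```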

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1.1848183151867784e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 1
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
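
The hyperparameters above map onto `transformers.TrainingArguments` roughly as follows. This is a sketch, not the original training script: it assumes a single device with no gradient accumulation (so the reported batch sizes equal the per-device values), and `output_dir` is a placeholder that does not come from the card.

```python
# Hyperparameters copied from the list above, keyed by TrainingArguments names.
TRAINING_KWARGS = {
    "learning_rate": 1.1848183151867784e-05,
    "per_device_train_batch_size": 4,  # assumes one device, no gradient accumulation
    "per_device_eval_batch_size": 8,   # same single-device assumption
    "seed": 1,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3,
}


def build_training_arguments(output_dir: str = "pretoxtm-sentence-classifier"):
    """Create TrainingArguments; output_dir is a placeholder, not from the card."""
    # Lazy import; requires transformers with a PyTorch backend installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```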

Training results

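| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | Accuracy | F1     |
|---------------|-------|------|-----------------|-----------|--------|----------|--------|
| 0.2543        | 1.0   | 514  | 0.1181          | 0.9788    | 0.9800 | 0.9795   | 0.9794 |
| 0.1344        | 2.0   | 1028 | 0.1488          | 0.9767    | 0.9775 | 0.9773   | 0.9771 |
| 0.0419        | 3.0   | 1542 | 0.1520          | 0.9767    | 0.9775 | 0.9773   | 0.9771 |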
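
To illustrate what the linear scheduler implies at the evaluation steps in the table, the sketch below computes the learning rate at each epoch boundary. It assumes zero warmup steps, since no warmup setting is listed among the hyperparameters; `linear_lr` is a hypothetical helper that mirrors the usual linear warmup and decay rule.

```python
BASE_LR = 1.1848183151867784e-05  # learning_rate from the hyperparameter list
TOTAL_STEPS = 1542                # final step in the training results table


def linear_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS, warmup_steps: int = 0) -> float:
    """Learning rate after `step` optimizer updates under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 514, 1028, 1542):  # start of training and each epoch boundary
    print(f"step {step:4d}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the learning rate has fallen to two thirds of its initial value by the end of epoch 1, one third by the end of epoch 2, and zero at the final step.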

Framework versions

  • Transformers 4.39.3
  • PyTorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2