base_model:
  - openai/whisper-small

Note: This classifier's state dict also contains the fine-tuned whisper-small weights. My model wrapper loads them properly.

Results of the classifier on Rob's human-annotated dataset (data/voicemail_human_eval.csv):

Results for chunk size 1 second:

  • Accuracy: 0.8080
  • Precision: 0.9353
  • Recall: 0.7692
  • F1 Score: 0.8442

Results for chunk size 2 seconds:

  • Accuracy: 0.8560
  • Precision: 0.9650
  • Recall: 0.8166
  • F1 Score: 0.8846

Results for chunk size 5 seconds:

  • Accuracy: 0.8640
  • Precision: 0.9856
  • Recall: 0.8107
  • F1 Score: 0.8896

Results for chunk size 10 seconds:

  • Accuracy: 0.8760
  • Precision: 1.0000
  • Recall: 0.8166
  • F1 Score: 0.8990

Results for full audio samples:

  • Accuracy: 0.8760
  • Precision: 1.0000
  • Recall: 0.8166
  • F1 Score: 0.8990
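
As a quick sanity check, the reported F1 scores can be recomputed from the reported precision and recall using the standard formula F1 = 2PR / (P + R). The sketch below is not part of the evaluation code for this model; the `reported` dictionary and the `f1_score` helper are illustrative names, and the numbers are copied from the results above.

```python
# Reported (precision, recall, F1) per setting, copied from the results above.
reported = {
    "1 s chunks": (0.9353, 0.7692, 0.8442),
    "2 s chunks": (0.9650, 0.8166, 0.8846),
    "5 s chunks": (0.9856, 0.8107, 0.8896),
    "10 s chunks": (1.0000, 0.8166, 0.8990),
    "full audio": (1.0000, 0.8166, 0.8990),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for setting, (p, r, f1) in reported.items():
    computed = f1_score(p, r)
    print(f"{setting}: computed F1 = {computed:.4f}, reported F1 = {f1:.4f}")
```

The recomputed values agree with the reported ones up to rounding. Precision reaches 1.0000 at 10 seconds and for full audio, meaning no false positives on this dataset, while recall stays at 0.8166.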