metadata

language: en
tags:
  - transcription
  - T5
  - huggingface
license: apache-2.0
datasets: custom
model_type: t5

T5-based Audio Transcription Fusion Model

This model combines candidate transcriptions from multiple sources into a single output transcription. It is fine-tuned on a custom dataset in which each sample pairs three candidate transcriptions with one reference transcription.
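The following is a minimal inference sketch, not the documented usage. This card does not specify how the three candidates are packed into one input string, so the `fuse: ` prefix, the ` | ` separator, and the `model_dir` argument are all hypothetical placeholders; they would need to match whatever format was used during fine-tuning. The `transformers` calls themselves (`AutoTokenizer`, `AutoModelForSeq2SeqLM`, `generate`) are the standard ones for T5 checkpoints.

```python
from typing import Sequence


def build_fusion_input(
    candidates: Sequence[str],
    prefix: str = "fuse: ",  # hypothetical task prefix, not documented in this card
    sep: str = " | ",        # hypothetical separator, not documented in this card
) -> str:
    """Join three candidate transcriptions into one source string."""
    cleaned = [c.strip() for c in candidates]
    if len(cleaned) != 3:
        raise ValueError("expected exactly three candidate transcriptions")
    return prefix + sep.join(cleaned)


def fuse(candidates: Sequence[str], model_dir: str, max_new_tokens: int = 128) -> str:
    """Generate a fused transcription with a fine-tuned T5 checkpoint.

    `model_dir` is a placeholder for a local path or Hub id of the checkpoint.
    """
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)

    inputs = tokenizer(
        build_fusion_input(candidates), return_tensors="pt", truncation=True
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call would look like `fuse(["the cat sat", "the cat sad", "a cat sat"], model_dir="path/to/checkpoint")`, with the path replaced by the actual checkpoint location.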

Training Details

The model was fine-tuned from T5-small on 9,000 samples for 10 epochs.

Training Loss: 0.01410862896591425

Evaluation Details

Test Loss: 0.009594996869187836

Word Error Rate (WER): 0.08080686073197246
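For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, insertions) between a hypothesis and its reference, divided by the number of reference words. The sketch below computes it for a single pair in plain Python. It is an illustration only: this card does not state which tool produced the reported figure, whether text was normalized first, or whether the score was averaged per sample or pooled over the test set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution (sat -> sit) and one deletion (the) over six reference words.
print(round(word_error_rate("the cat sat on the mat", "the cat sit on mat"), 4))  # 0.3333
```

Under this definition, the reported WER of about 0.081 corresponds to roughly 8 word errors per 100 reference words.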