orkidea/whisper-small-guc

Automatic Speech Recognition


orkidea committed on Oct 12, 2023

Commit 9a670f7 · 1 Parent(s): c075bb2

Update README.md

Files changed (1)

README.md +10 -0


@@ -21,6 +21,16 @@ This model has been trained on a unique dataset derived from parsed audio and te
 This model represents an initial endeavor in the journey of developing transcription models specifically for indigenous languages. The creation and improvement of such models have profound societal implications. It not only helps in preserving and promoting indigenous languages but also serves as a valuable asset for linguistic studies, helping scholars and communities alike in understanding and promoting the rich cultural tapestry of indigenous languages.
+## Dataset Details
+The dataset consists of 1,835 audio recordings, each accompanied by its respective transcription. The lexical corpus encompasses approximately 3,000 unique words.
+- **Total Audio Duration**: 6241.65 seconds (approximately 1.7 hours)
+- **Average Audio Duration**: 3.41 seconds
+This collection of data serves as a foundational resource for understanding and processing the Wayuunaiki language.
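
The duration figures above are related by simple arithmetic, which can be sketched as follows. This is only an illustrative check using the numbers reported in the README; the function name `summarize_durations` is hypothetical and is not part of the model or dataset tooling.

```python
def summarize_durations(total_seconds: float, n_clips: int) -> dict:
    """Derive summary statistics from a total duration and a clip count.

    Both inputs are taken from the README's reported figures; this helper
    is a hypothetical illustration, not the author's actual script.
    """
    if n_clips <= 0:
        raise ValueError("n_clips must be positive")
    return {
        "hours": total_seconds / 3600.0,        # seconds -> hours
        "avg_seconds": total_seconds / n_clips,  # mean clip length
    }


# Figures reported in the Dataset Details section.
stats = summarize_durations(total_seconds=6241.65, n_clips=1835)
print(f"Total: {stats['hours']:.2f} hours")
print(f"Average: {stats['avg_seconds']:.2f} seconds per recording")
```

The computed values land close to the reported ones (about 1.7 hours in total and a little over 3.4 seconds per recording); small differences in the last digit may come from rounding in the reported total.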
 # Model Accuracy Warning
 While this model has shown promising results, it's essential to be aware of its limitations: