orkidea committed on
Commit 9a670f7 · 1 Parent(s): c075bb2

Update README.md

Files changed (1)
  1. README.md +10 -0
README.md CHANGED
@@ -21,6 +21,16 @@ This model has been trained on a unique dataset derived from parsed audio and te
 
  This model represents an initial endeavor in the journey of developing transcription models specifically for indigenous languages. The creation and improvement of such models have profound societal implications. It not only helps in preserving and promoting indigenous languages but also serves as a valuable asset for linguistic studies, helping scholars and communities alike in understanding and promoting the rich cultural tapestry of indigenous languages.
 
+ ## Dataset Details
+
+ The dataset consists of 1,835 audio recordings, each accompanied by its respective transcription. The lexical corpus encompasses approximately 3,000 unique words.
+
+ - **Total Audio Duration**: 6241.65 seconds (approximately 1.7 hours)
+ - **Average Audio Duration**: 3.41 seconds
+
+ This collection of data serves as a foundational resource for understanding and processing the Wayuunaiki language.
+
+
  # Model Accuracy Warning
 
  While this model has shown promising results, it's essential to be aware of its limitations:
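
The duration figures in the added Dataset Details section can be cross-checked from the two numbers the README states: the total duration (6241.65 seconds) and the recording count (1,835). The snippet below is a minimal sketch of that arithmetic only. The function name `summarize_durations` is hypothetical and is not part of the model repository, and the snippet does not read the actual audio files.

```python
def summarize_durations(total_seconds: float, n_recordings: int) -> dict:
    """Derive total hours and mean clip length from a total duration and a clip count.

    Hypothetical helper for illustration; not taken from the repository.
    """
    if n_recordings <= 0:
        raise ValueError("n_recordings must be positive")
    return {
        "total_hours": total_seconds / 3600,           # seconds -> hours
        "mean_seconds": total_seconds / n_recordings,  # average clip length
    }


# Figures quoted in the README addition.
stats = summarize_durations(total_seconds=6241.65, n_recordings=1835)
print(f"{stats['total_hours']:.2f} h total, {stats['mean_seconds']:.2f} s per clip")
# prints "1.73 h total, 3.40 s per clip"
```

The total agrees with the stated "approximately 1.7 hours". The mean derived this way is about 3.40 seconds, slightly below the 3.41 seconds quoted. The commit does not say how the average was computed, so the small gap may come from rounding or from a per-file calculation.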