freds0/distil-whisper-large-v3-ptbr

Automatic Speech Recognition


freds0 committed on Oct 16, 2024

Commit aeb215c (1 parent: 9d31bcd)

Update README.md

Files changed (1)

README.md +17 -4


@@ -36,17 +36,30 @@ You can use the model with the Transformers library:
 from transformers import WhisperForConditionalGeneration, WhisperProcessor
 ```python
 processor = WhisperProcessor.from_pretrained("freds0/distil-whisper-large-v3-ptbr")
 model = WhisperForConditionalGeneration.from_pretrained("freds0/distil-whisper-large-v3-ptbr")
-# Load audio and process
-audio_input = ...  # your audio here
-input_features = processor(audio_input, sampling_rate=16000, return_tensors="pt").input_features
 # Generate transcription
 predicted_ids = model.generate(input_features)
 transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
-print(transcription[0])
 ```

 from transformers import WhisperForConditionalGeneration, WhisperProcessor
 ```python
+from datasets import load_dataset
+from transformers import WhisperProcessor, WhisperForConditionalGeneration
+# Load the validation split of the Common Voice dataset for Portuguese
+common_voice = load_dataset("mozilla-foundation/common_voice_11_0", "pt", split="validation")
+# Load the pretrained model and processor
 processor = WhisperProcessor.from_pretrained("freds0/distil-whisper-large-v3-ptbr")
 model = WhisperForConditionalGeneration.from_pretrained("freds0/distil-whisper-large-v3-ptbr")
+# Select a sample from the dataset
+sample = common_voice[0]  # You can change the index to select a different sample
+# Get the audio array and sampling rate
+audio_input = sample["audio"]["array"]
+sampling_rate = sample["audio"]["sampling_rate"]
+# Preprocess the audio
+input_features = processor(audio_input, sampling_rate=sampling_rate, return_tensors="pt").input_features
 # Generate transcription
 predicted_ids = model.generate(input_features)
 transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
+print("Transcription:", transcription[0])
 ```
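
A note on the sampling rate: the removed lines passed `sampling_rate=16000` to the processor, while the added lines pass whatever rate the dataset sample reports. Common Voice clips are commonly distributed at 48 kHz, so the audio may need resampling to 16 kHz before the processor call. The sketch below is a minimal, dependency-free illustration of that step using plain linear interpolation. The names `TARGET_SR` and `resample_linear` are hypothetical and not part of this commit or of any library, and linear interpolation applies no anti-aliasing filter, so this illustrates the idea rather than a production-quality resampler.

```python
TARGET_SR = 16_000  # assumed target rate, taken from the removed processor call


def resample_linear(samples, orig_sr, target_sr=TARGET_SR):
    """Resample a mono waveform by linear interpolation (illustrative only)."""
    if orig_sr <= 0 or target_sr <= 0:
        raise ValueError("sampling rates must be positive")
    samples = [float(s) for s in samples]
    if orig_sr == target_sr or not samples:
        return samples

    n_out = round(len(samples) * target_sr / orig_sr)
    step = orig_sr / target_sr  # input samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)   # sample at or before the target position
        hi = min(lo + 1, last)     # next sample, clamped at the end
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


# One second of silence at 48 kHz becomes one second at 16 kHz.
one_second = resample_linear([0.0] * 48_000, orig_sr=48_000)
print(len(one_second))
```

With the snippet from this commit, the result of `resample_linear(sample["audio"]["array"], sample["audio"]["sampling_rate"])` could be passed to the processor together with `sampling_rate=16000`. An alternative that avoids hand-written resampling is to let `datasets` resample on load by casting the column, for example `common_voice.cast_column("audio", Audio(sampling_rate=16000))`.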