
Faster Whisper Transcription Service

Overview

This project uses the faster_whisper Python package to provide an API endpoint for audio transcription. It utilizes OpenAI's Whisper model (large-v3) for accurate and efficient speech-to-text conversion. The service is designed to be deployed on Hugging Face endpoints.
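
Internally, the endpoint presumably drives faster_whisper along the lines of the sketch below. The device and compute_type arguments and the audio file name are illustrative assumptions, not settings taken from this repository.

from faster_whisper import WhisperModel

# Load the large-v3 model; adjust device/compute_type to your hardware
# (e.g. device="cpu", compute_type="int8" on a CPU endpoint).
model = WhisperModel("large-v3", device="cuda", compute_type="float16")

# Transcribe a local audio file (placeholder path) in German.
segments, info = model.transcribe("audio.wav", language="de", task="transcribe")

for segment in segments:
    print(segment.id, segment.start, segment.end, segment.text)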

Features

  • Efficient Transcription: Utilizes the large-v3 Whisper model for high-quality transcription.
  • Multilingual Support: Supports transcription in multiple languages, with the default language set to German (de).
  • Segmented Output: Returns transcribed text with a segment ID and timestamps for each transcribed segment (see the sample response below).
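
An illustrative response might look like the following; the exact field names and layout returned by the endpoint may differ.

{
  "segments": [
    {"id": 0, "start": 0.0, "end": 3.2, "text": "Guten Tag und herzlich willkommen."},
    {"id": 1, "start": 3.2, "end": 6.8, "text": "Heute sprechen wir über Transkription."}
  ]
}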

Usage

import requests
import os

# Sample payload with the base64-encoded audio and the desired language for transcription
DATA = {
    "inputs": "<base64_encoded_audio>",
    "language": "de",
    "task": "transcribe"
}

HF_ACCESS_TOKEN = os.environ.get("HF_TRANSCRIPTION_ACCESS_TOKEN")
API_URL = os.environ.get("HF_TRANSCRIPTION_ENDPOINT")

HEADERS = {
    "Authorization": f"Bearer {HF_ACCESS_TOKEN}",
    "Content-Type": "application/json"
}

response = requests.post(API_URL, headers=HEADERS, json=DATA)
print(response.json())
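
The "inputs" placeholder above stands for a base64-encoded audio file. A minimal sketch of producing it from a local file (the file name is a placeholder):

import base64

# Encode a local audio file for the "inputs" field of the request payload.
with open("audio.wav", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode("utf-8")

DATA["inputs"] = audio_b64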

Logging

Logging is set to the debug level, providing detailed information during the transcription process, including the length of the decoded audio bytes, the progress of segments being transcribed, and a confirmation once inference has completed.
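
Equivalent debug logging can be enabled with the standard library as in the minimal sketch below; this is not the exact configuration used by the service, and the logger name and log values are illustrative.

import logging

# Enable debug-level logging, as described above.
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("transcription")

logger.debug("Decoded %d bytes of audio", 123456)
logger.debug("Transcribed segment %d: %.2fs - %.2fs", 0, 0.0, 3.2)
logger.info("Inference completed")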

Deployment

This service is intended for deployment on Hugging Face endpoints. Ensure you follow Hugging Face's guidelines for deploying model endpoints.
