Fedir Zadniprovskyi

- TODO: add a note about automatic downloads
- TODO: mention streaming
- TODO: add a demo
- TODO: talk about audio format
- TODO: add a note about performance
- TODO: add a note about vad

!!! note

    Before proceeding, make sure you are familiar with the [OpenAI Speech-to-Text](https://platform.openai.com/docs/guides/speech-to-text) guide and the relevant [OpenAI API reference](https://platform.openai.com/docs/api-reference/audio/createTranscription).

Curl

curl http://localhost:8000/v1/audio/transcriptions -F "file=@audio.wav"

Python

=== "httpx"

    ```python
    import httpx

    with open('audio.wav', 'rb') as f:
        files = {'file': ('audio.wav', f)}
        response = httpx.post('http://localhost:8000/v1/audio/transcriptions', files=files)

    print(response.text)
    ```
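The request above sends only the file. As a rough sketch, assuming the server honors the same optional form fields as the OpenAI API reference (`model`, `language`, `response_format`), they can be sent alongside the file as ordinary form data. The helper names `build_form_fields` and `transcribe` are hypothetical and exist only for illustration. The model name is the one used in the CLI example on this page.

```python
from pathlib import Path


def build_form_fields(model=None, response_format=None, language=None):
    """Collect the optional multipart form fields, skipping unset ones."""
    fields = {
        "model": model,
        "response_format": response_format,
        "language": language,
    }
    return {name: value for name, value in fields.items() if value is not None}


def transcribe(path, base_url="http://localhost:8000/v1", **options):
    """POST an audio file to the transcription endpoint and return the body."""
    import httpx  # imported lazily so build_form_fields has no third-party dependency

    data = build_form_fields(**options)
    with open(path, "rb") as f:
        files = {"file": (Path(path).name, f)}
        response = httpx.post(
            f"{base_url}/audio/transcriptions",
            files=files,
            data=data,
            timeout=None,  # assumption: long audio may exceed httpx's 5 second default
        )
    response.raise_for_status()  # raise on HTTP errors instead of returning an error body
    return response.text
```

A call such as `transcribe("audio.wav", model="Systran/faster-whisper-small", response_format="text")` would then mirror the CLI invocation shown later on this page.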

OpenAI SDKs

!!! note

    Although this project doesn't require an API key, all OpenAI SDKs do, so you will need to set it to a non-empty value. You will also need to override the base URL so that it points to your server.

    Both can be set through the `OPENAI_API_KEY` and `OPENAI_BASE_URL` environment variables or passed as arguments to the SDK.

=== "Python"

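    ```python
    from openai import OpenAI

    # The server ignores the API key, but the SDK requires a non-empty value.
    client = OpenAI(base_url='http://localhost:8000/v1/', api_key='cant-be-empty')

    with open('audio.wav', 'rb') as f:
        transcript = client.audio.transcriptions.create(
            model='Systran/faster-whisper-small', file=f
        )

    print(transcript.text)
    ```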

=== "CLI"

    ```bash
    export OPENAI_BASE_URL=http://localhost:8000/v1/
    export OPENAI_API_KEY="cant-be-empty"
    openai api audio.transcriptions.create -m Systran/faster-whisper-small -f audio.wav --response-format text
    ```

=== "Other"

    See [OpenAI libraries](https://platform.openai.com/docs/libraries).
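The environment-variable route mentioned in the note above also works from Python. The sketch below assumes the `openai` package (v1 or later) is installed. The function name `transcribe_with_sdk` is hypothetical, and the URL and key are the same placeholder values used in the CLI example.

```python
import os

# Placeholder values; setdefault leaves anything already exported in the shell untouched.
os.environ.setdefault("OPENAI_BASE_URL", "http://localhost:8000/v1/")
os.environ.setdefault("OPENAI_API_KEY", "cant-be-empty")


def transcribe_with_sdk(path, model="Systran/faster-whisper-small"):
    """Transcribe a file with a client configured purely from the environment."""
    from openai import OpenAI  # requires the `openai` package (v1 or later)

    client = OpenAI()  # no arguments: both settings are read from the environment
    with open(path, "rb") as f:
        result = client.audio.transcriptions.create(model=model, file=f)
    return result.text
```

Keeping the settings in the environment means the same script can target a different server without code changes.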