URL: https://docs.pipecat.ai/server/utilities/audio/audio-buffer-processor#audio-processing-features
Title: AudioBufferProcessor - Pipecat
==================================================

Overview

The AudioBufferProcessor captures and buffers audio frames from both input (user) and output (bot) sources during conversations. It provides synchronized audio streams with configurable sample rates, supports both mono and stereo output, and offers flexible event handlers for various audio processing workflows.

Constructor

    AudioBufferProcessor(
        sample_rate=None,
        num_channels=1,
        buffer_size=0,
        enable_turn_audio=False,
        **kwargs
    )

Parameters

sample_rate (Optional[int], default: None)
    The desired output sample rate in Hz. If None, uses the transport’s sample rate from the StartFrame.

num_channels (int, default: 1)
    Number of output audio channels:
    - 1: Mono output (user and bot audio are mixed together)
    - 2: Stereo output (user audio on left channel, bot audio on right channel)

buffer_size (int, default: 0)
    Buffer size in bytes that triggers audio data events:
    - 0: Events only trigger when recording stops
    - >0: Events trigger whenever the buffer reaches this size (useful for chunked processing)

enable_turn_audio (bool, default: False)
    Whether to enable per-turn audio event handlers (on_user_turn_audio_data and on_bot_turn_audio_data).

Properties

sample_rate

    @property
    def sample_rate(self) -> int

The current sample rate of the audio processor in Hz.

num_channels

    @property
    def num_channels(self) -> int

The number of channels in the audio output (1 for mono, 2 for stereo).

Methods

start_recording()

    async def start_recording()

Start recording audio from both user and bot sources. Initializes recording state and resets audio buffers.

stop_recording()

    async def stop_recording()

Stop recording and trigger final audio data handlers with any remaining buffered audio.

has_audio()

    def has_audio() -> bool

Check if both user and bot audio buffers contain data.

Returns: True if both buffers contain audio data.

Event Handlers

The processor supports multiple event handlers for different audio processing workflows. Register handlers using the @processor.event_handler() decorator.

on_audio_data

Triggered when buffer_size is reached or recording stops, providing merged audio.

    @audiobuffer.event_handler("on_audio_data")
    async def on_audio_data(buffer, audio: bytes, sample_rate: int, num_channels: int):
        # Handle merged audio data
        pass

Parameters:
- buffer: The AudioBufferProcessor instance
- audio: Merged audio data (format depends on num_channels setting)
- sample_rate: Sample rate in Hz
- num_channels: Number of channels (1 or 2)

on_track_audio_data

Triggered alongside on_audio_data, providing separate user and bot audio tracks.

    @audiobuffer.event_handler("on_track_audio_data")
    async def on_track_audio_data(buffer, user_audio: bytes, bot_audio: bytes, sample_rate: int, num_channels: int):
        # Handle separate audio tracks
        pass

Parameters:
- buffer: The AudioBufferProcessor instance
- user_audio: Raw user audio bytes (always mono)
- bot_audio: Raw bot audio bytes (always mono)
- sample_rate: Sample rate in Hz
- num_channels: Always 1 for individual tracks

on_user_turn_audio_data

Triggered when a user speaking turn ends. Requires enable_turn_audio=True.

    @audiobuffer.event_handler("on_user_turn_audio_data")
    async def on_user_turn_audio_data(buffer, audio: bytes, sample_rate: int, num_channels: int):
        # Handle user turn audio
        pass

Parameters:
- buffer: The AudioBufferProcessor instance
- audio: Audio data from the user’s speaking turn
- sample_rate: Sample rate in Hz
- num_channels: Always 1 (mono)

on_bot_turn_audio_data

Triggered when a bot speaking turn ends. Requires enable_turn_audio=True.

    @audiobuffer.event_handler("on_bot_turn_audio_data")
    async def on_bot_turn_audio_data(buffer, audio: bytes, sample_rate: int, num_channels: int):
        # Handle bot turn audio
        pass

Parameters:
- buffer: The AudioBufferProcessor instance
- audio: Audio data from the bot’s speaking turn
- sample_rate: Sample rate in Hz
- num_channels: Always 1 (mono)

Audio Processing Features

- Automatic resampling: Converts incoming audio to the specified sample rate
- Buffer synchronization: Aligns user and bot audio streams temporally
- Silence insertion: Fills gaps in non-continuous audio streams to maintain timing
- Turn tracking: Monitors speaking turns when enable_turn_audio=True

Integration Notes

STT Audio Passthrough

If using an STT service in your pipeline, enable audio passthrough to make audio available to the AudioBufferProcessor:

    stt = DeepgramSTTService(
        api_key=os.getenv("DEEPGRAM_API_KEY"),
        audio_passthrough=True,
    )

Note: audio_passthrough is enabled by default.

Pipeline Placement

Add the AudioBufferProcessor after transport.output() to capture both user and bot audio:

    pipeline = Pipeline([
        transport.input(),
        # ... other processors ...
        transport.output(),
        audiobuffer,  # Place after audio output
        # ... remaining processors ...
    ])
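
Example: sizing the buffer and saving audio

The event handlers described above receive raw audio bytes together with sample_rate and num_channels, and buffer_size is expressed in bytes. The sketch below shows one way to turn a duration into a buffer_size value and to wrap handler audio in a WAV container using only the Python standard library. It assumes the audio is 16-bit signed PCM, which this page does not state, and both helper names (buffer_size_for_duration, pcm_to_wav_bytes) are hypothetical, not part of Pipecat.

```python
import io
import wave

# Assumption: samples are 16-bit signed PCM (2 bytes each). Not stated on this page.
BYTES_PER_SAMPLE = 2


def buffer_size_for_duration(seconds: float, sample_rate: int, num_channels: int) -> int:
    """Hypothetical helper: bytes needed to hold `seconds` of audio."""
    frames = int(seconds * sample_rate)
    return frames * num_channels * BYTES_PER_SAMPLE


def pcm_to_wav_bytes(audio: bytes, sample_rate: int, num_channels: int) -> bytes:
    """Hypothetical helper: wrap raw PCM bytes in an in-memory WAV file."""
    with io.BytesIO() as out:
        with wave.open(out, "wb") as wf:
            wf.setnchannels(num_channels)
            wf.setsampwidth(BYTES_PER_SAMPLE)
            wf.setframerate(sample_rate)
            wf.writeframes(audio)
        # The WAV header is finalized when the inner block closes the writer.
        return out.getvalue()
```

Under these assumptions, an on_audio_data handler could pass its audio, sample_rate, and num_channels arguments straight to pcm_to_wav_bytes and write the result to a file, and buffer_size_for_duration(1.0, 16000, 1) could be used as the buffer_size value for roughly one-second mono chunks at 16 kHz.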