diff --git a/_sources_indexrsttxt_cf07c1c2.txt b/_sources_indexrsttxt_cf07c1c2.txt new file mode 100644 index 0000000000000000000000000000000000000000..bd34bd05c534bf555aa8e130d07ad9a2a9cbd475 --- /dev/null +++ b/_sources_indexrsttxt_cf07c1c2.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/links/_sources/index.rst.txt#what-you-can-build +Title: Overview - Pipecat +================================================== + +Overview - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Get Started Overview Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Get Started Overview Installation & Setup Quickstart Core Concepts Next Steps & Examples Pipecat is an open source Python framework that handles the complex orchestration of AI services, network transport, audio processing, and multimodal interactions. “Multimodal” means you can use any combination of audio, video, images, and/or text in your interactions. And “real-time” means that things are happening quickly enough that it feels conversational—a “back-and-forth” with a bot, not submitting a query and waiting for results. ​ What You Can Build Voice Assistants Natural, real-time conversations with AI using speech recognition and synthesis Interactive Agents Personal coaches and meeting assistants that can understand context and provide guidance Multimodal Apps Applications that combine voice, video, images, and text for rich interactions Creative Tools Storytelling experiences and social companions that engage users Business Solutions Customer intake flows and support bots for automated business processes Complex Flows Structured conversations using Pipecat Flows for managing complex interactions ​ How It Works The flow of interactions in a Pipecat application is typically straightforward: The bot says something The user says something The bot says something The user says something This continues until the conversation naturally ends. While this flow seems simple, making it feel natural requires sophisticated real-time processing. ​ Real-time Processing Pipecat’s pipeline architecture handles both simple voice interactions and complex multimodal processing. Let’s look at how data flows through the system: Voice app Multimodal app 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio and Video Transmit and capture audio, video, and image inputs simultaneously 2 Process Streams Handle multiple input streams in parallel 3 Model Processing Send combined inputs to multimodal models (like GPT-4V) 4 Generate Outputs Create various outputs (text, images, audio, etc.) 
5 Coordinate Presentation Synchronize and present multiple output types In both cases, Pipecat: Processes responses as they stream in Handles multiple input/output modalities concurrently Manages resource allocation and synchronization Coordinates parallel processing tasks This architecture creates fluid, natural interactions without noticeable delays, whether you’re building a simple voice assistant or a complex multimodal application. Pipecat’s pipeline architecture is particularly valuable for managing the complexity of real-time, multimodal interactions, ensuring smooth data flow and proper synchronization regardless of the input/output types involved. Pipecat handles all this complexity for you, letting you focus on building your application rather than managing the underlying infrastructure. ​ Next Steps Ready to build your first Pipecat application? Installation & Setup Prepare your environment and install required dependencies Quickstart Build and run your first Pipecat application Core Concepts Learn about pipelines, frames, and real-time processing Use Cases Explore example implementations and patterns ​ Join Our Community Discord Community Connect with other developers, share your projects, and get support from the Pipecat team. Installation & Setup On this page What You Can Build How It Works Real-time Processing Next Steps Join Our Community Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/_sources_indexrsttxt_eb6b9580.txt b/_sources_indexrsttxt_eb6b9580.txt new file mode 100644 index 0000000000000000000000000000000000000000..d3a9d003a1c46b11fc93eb07f4c25bf8b3c143a2 --- /dev/null +++ b/_sources_indexrsttxt_eb6b9580.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/links/_sources/index.rst.txt +Title: Overview - Pipecat +================================================== + +Overview - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Get Started Overview Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Get Started Overview Installation & Setup Quickstart Core Concepts Next Steps & Examples Pipecat is an open source Python framework that handles the complex orchestration of AI services, network transport, audio processing, and multimodal interactions. “Multimodal” means you can use any combination of audio, video, images, and/or text in your interactions. And “real-time” means that things are happening quickly enough that it feels conversational—a “back-and-forth” with a bot, not submitting a query and waiting for results. ​ What You Can Build Voice Assistants Natural, real-time conversations with AI using speech recognition and synthesis Interactive Agents Personal coaches and meeting assistants that can understand context and provide guidance Multimodal Apps Applications that combine voice, video, images, and text for rich interactions Creative Tools Storytelling experiences and social companions that engage users Business Solutions Customer intake flows and support bots for automated business processes Complex Flows Structured conversations using Pipecat Flows for managing complex interactions ​ How It Works The flow of interactions in a Pipecat application is typically straightforward: The bot says something The user says something The bot says something The user says something This continues until the conversation naturally ends. While this flow seems simple, making it feel natural requires sophisticated real-time processing. 
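To make this concrete, here is a minimal sketch of how that conversational loop maps onto a Pipecat pipeline. The service choices (Daily transport with built-in transcription, OpenAI for the LLM, ElevenLabs for TTS), the run_bot name, and the environment-variable credentials are illustrative assumptions rather than the only supported setup:

import os

from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport


async def run_bot(room_url: str, token: str):
    # Transport: moves audio between the user and the pipeline (Daily is one option)
    transport = DailyTransport(
        room_url,
        token,
        "Voice bot",
        DailyParams(
            audio_in_enabled=True,
            audio_out_enabled=True,
            vad_analyzer=SileroVADAnalyzer(),
            transcription_enabled=True,  # transport-side speech-to-text
        ),
    )

    # LLM and TTS services (any supported providers work here)
    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")
    tts = ElevenLabsTTSService(api_key=os.getenv("ELEVENLABS_API_KEY"))

    # Conversation context shared between user and assistant turns
    context = OpenAILLMContext(
        [{"role": "system", "content": "You are a helpful voice bot."}]
    )
    context_aggregator = llm.create_context_aggregator(context)

    # The conversational loop: audio in -> transcript -> LLM -> speech -> audio out
    pipeline = Pipeline([
        transport.input(),
        context_aggregator.user(),
        llm,
        tts,
        transport.output(),
        context_aggregator.assistant(),
    ])

    await PipelineRunner().run(PipelineTask(pipeline))

Swapping in a different transport, STT, LLM, or TTS service changes only the corresponding constructor; the pipeline shape stays the same.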
​ Real-time Processing Pipecat’s pipeline architecture handles both simple voice interactions and complex multimodal processing. Let’s look at how data flows through the system: Voice app Multimodal app 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio and Video Transmit and capture audio, video, and image inputs simultaneously 2 Process Streams Handle multiple input streams in parallel 3 Model Processing Send combined inputs to multimodal models (like GPT-4V) 4 Generate Outputs Create various outputs (text, images, audio, etc.) 5 Coordinate Presentation Synchronize and present multiple output types In both cases, Pipecat: Processes responses as they stream in Handles multiple input/output modalities concurrently Manages resource allocation and synchronization Coordinates parallel processing tasks This architecture creates fluid, natural interactions without noticeable delays, whether you’re building a simple voice assistant or a complex multimodal application. Pipecat’s pipeline architecture is particularly valuable for managing the complexity of real-time, multimodal interactions, ensuring smooth data flow and proper synchronization regardless of the input/output types involved. Pipecat handles all this complexity for you, letting you focus on building your application rather than managing the underlying infrastructure. ​ Next Steps Ready to build your first Pipecat application? Installation & Setup Prepare your environment and install required dependencies Quickstart Build and run your first Pipecat application Core Concepts Learn about pipelines, frames, and real-time processing Use Cases Explore example implementations and patterns ​ Join Our Community Discord Community Connect with other developers, share your projects, and get support from the Pipecat team. Installation & Setup On this page What You Can Build How It Works Real-time Processing Next Steps Join Our Community Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/_sources_indexrsttxt_ef601814.txt b/_sources_indexrsttxt_ef601814.txt new file mode 100644 index 0000000000000000000000000000000000000000..4cef9877f2676bb053bae8a0f1350e30001e747d --- /dev/null +++ b/_sources_indexrsttxt_ef601814.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/links/_sources/index.rst.txt#join-our-community +Title: Overview - Pipecat +================================================== + +Overview - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Get Started Overview Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Get Started Overview Installation & Setup Quickstart Core Concepts Next Steps & Examples Pipecat is an open source Python framework that handles the complex orchestration of AI services, network transport, audio processing, and multimodal interactions. 
“Multimodal” means you can use any combination of audio, video, images, and/or text in your interactions. And “real-time” means that things are happening quickly enough that it feels conversational—a “back-and-forth” with a bot, not submitting a query and waiting for results. ​ What You Can Build Voice Assistants Natural, real-time conversations with AI using speech recognition and synthesis Interactive Agents Personal coaches and meeting assistants that can understand context and provide guidance Multimodal Apps Applications that combine voice, video, images, and text for rich interactions Creative Tools Storytelling experiences and social companions that engage users Business Solutions Customer intake flows and support bots for automated business processes Complex Flows Structured conversations using Pipecat Flows for managing complex interactions ​ How It Works The flow of interactions in a Pipecat application is typically straightforward: The bot says something The user says something The bot says something The user says something This continues until the conversation naturally ends. While this flow seems simple, making it feel natural requires sophisticated real-time processing. ​ Real-time Processing Pipecat’s pipeline architecture handles both simple voice interactions and complex multimodal processing. Let’s look at how data flows through the system: Voice app Multimodal app 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio and Video Transmit and capture audio, video, and image inputs simultaneously 2 Process Streams Handle multiple input streams in parallel 3 Model Processing Send combined inputs to multimodal models (like GPT-4V) 4 Generate Outputs Create various outputs (text, images, audio, etc.) 5 Coordinate Presentation Synchronize and present multiple output types In both cases, Pipecat: Processes responses as they stream in Handles multiple input/output modalities concurrently Manages resource allocation and synchronization Coordinates parallel processing tasks This architecture creates fluid, natural interactions without noticeable delays, whether you’re building a simple voice assistant or a complex multimodal application. Pipecat’s pipeline architecture is particularly valuable for managing the complexity of real-time, multimodal interactions, ensuring smooth data flow and proper synchronization regardless of the input/output types involved. Pipecat handles all this complexity for you, letting you focus on building your application rather than managing the underlying infrastructure. ​ Next Steps Ready to build your first Pipecat application? 
Installation & Setup Prepare your environment and install required dependencies Quickstart Build and run your first Pipecat application Core Concepts Learn about pipelines, frames, and real-time processing Use Cases Explore example implementations and patterns ​ Join Our Community Discord Community Connect with other developers, share your projects, and get support from the Pipecat team. Installation & Setup On this page What You Can Build How It Works Real-time Processing Next Steps Join Our Community Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/analytics_sentry_d30af7e0.txt b/analytics_sentry_d30af7e0.txt new file mode 100644 index 0000000000000000000000000000000000000000..f29278fc9aa51b5f8b8b6e02ee051dceb912295d --- /dev/null +++ b/analytics_sentry_d30af7e0.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/services/analytics/sentry#installation +Title: Sentry Metrics - Pipecat +================================================== + +Sentry Metrics - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Analytics & Monitoring Sentry Metrics Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Sentry Metrics Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview SentryMetrics extends FrameProcessorMetrics to provide performance monitoring integration with Sentry. It tracks Time to First Byte (TTFB) and processing duration metrics for frame processors. ​ Installation To use Sentry metrics, install the Sentry SDK: Copy Ask AI pip install "pipecat-ai[sentry]" ​ Configuration Sentry must be initialized in your application before metrics will be collected: Copy Ask AI import sentry_sdk sentry_sdk.init( dsn = "your-sentry-dsn" , traces_sample_rate = 1.0 , ) ​ Usage Example Copy Ask AI import os import sentry_sdk from pipecat.audio.vad.silero import SileroVADAnalyzer from pipecat.pipeline.pipeline import Pipeline from pipecat.services.openai.llm import OpenAILLMService from pipecat.services.elevenlabs.tts import ElevenLabsTTSService from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext from pipecat.processors.metrics.sentry import SentryMetrics from pipecat.transports.services.daily import DailyParams, DailyTransport async def create_metrics_pipeline (): sentry_sdk.init( dsn = "your-sentry-dsn" , traces_sample_rate = 1.0 , ) transport = DailyTransport( room_url, token, "Chatbot" , DailyParams( audio_out_enabled = True , audio_in_enabled = True , video_out_enabled = False , vad_analyzer = SileroVADAnalyzer(), transcription_enabled = True , ), ) tts = ElevenLabsTTSService( api_key = os.getenv( "ELEVENLABS_API_KEY" ), metrics = SentryMetrics(), ) llm = OpenAILLMService( api_key = os.getenv( "OPENAI_API_KEY" ), model = "gpt-4o" , metrics = SentryMetrics(), ) messages = [ { "role" : "system" , "content" : "You are Chatbot, a friendly, helpful robot. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. 
Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by introducing yourself. Keep all your responses to 12 words or fewer." , }, ] context = OpenAILLMContext(messages) context_aggregator = llm.create_context_aggregator(context) # Use in pipeline pipeline = Pipeline([ transport.input(), context_aggregator.user(), llm, tts, transport.output(), context_aggregator.assistant(), ]) ​ Transaction Information Each transaction includes: Operation type ( ttfb or processing ) Description with processor name Start timestamp End timestamp Unique transaction ID ​ Fallback Behavior If Sentry is not available (not installed or not initialized): Warning logs are generated Metric methods execute without error No data is sent to Sentry ​ Notes Requires Sentry SDK to be installed and initialized Thread-safe metric collection Automatic transaction management Supports selective TTFB reporting Integrates with Sentry’s performance monitoring Provides detailed timing information Maintains timing data even when Sentry is unavailable Moondream Producer & Consumer Processors On this page Overview Installation Configuration Usage Example Transaction Information Fallback Behavior Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/android_api-reference_9763bcbe.txt b/android_api-reference_9763bcbe.txt new file mode 100644 index 0000000000000000000000000000000000000000..211b20ac2bd072daaef656b394ec4c4550243064 --- /dev/null +++ b/android_api-reference_9763bcbe.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/client/android/api-reference +Title: All modules +================================================== + +All modules All modules: pipecat-client-android Link copied to clipboard pipecat-transport-daily Link copied to clipboard pipecat-transport-gemini-live-websocket Link copied to clipboard pipecat-transport-openai-realtime-webrtc Link copied to clipboard © 2025 Copyright Generated by dokka \ No newline at end of file diff --git a/audio_audio-buffer-processor_94494faa.txt b/audio_audio-buffer-processor_94494faa.txt new file mode 100644 index 0000000000000000000000000000000000000000..8e808a4eca9ea56b35ee75c57a4da39e68022581 --- /dev/null +++ b/audio_audio-buffer-processor_94494faa.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/utilities/audio/audio-buffer-processor#start-recording +Title: AudioBufferProcessor - Pipecat +================================================== + +AudioBufferProcessor - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Audio Processing AudioBufferProcessor Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing AudioBufferProcessor KoalaFilter KrispFilter NoisereduceFilter SileroVADAnalyzer SoundfileMixer Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview The AudioBufferProcessor captures and buffers audio frames from both input (user) and output (bot) sources during conversations. 
It provides synchronized audio streams with configurable sample rates, supports both mono and stereo output, and offers flexible event handlers for various audio processing workflows. ​ Constructor Copy Ask AI AudioBufferProcessor( sample_rate = None , num_channels = 1 , buffer_size = 0 , enable_turn_audio = False , ** kwargs ) ​ Parameters ​ sample_rate Optional[int] default: "None" The desired output sample rate in Hz. If None , uses the transport’s sample rate from the StartFrame . ​ num_channels int default: "1" Number of output audio channels: 1 : Mono output (user and bot audio are mixed together) 2 : Stereo output (user audio on left channel, bot audio on right channel) ​ buffer_size int default: "0" Buffer size in bytes that triggers audio data events: 0 : Events only trigger when recording stops >0 : Events trigger whenever buffer reaches this size (useful for chunked processing) ​ enable_turn_audio bool default: "False" Whether to enable per-turn audio event handlers ( on_user_turn_audio_data and on_bot_turn_audio_data ). ​ Properties ​ sample_rate Copy Ask AI @ property def sample_rate ( self ) -> int The current sample rate of the audio processor in Hz. ​ num_channels Copy Ask AI @ property def num_channels ( self ) -> int The number of channels in the audio output (1 for mono, 2 for stereo). ​ Methods ​ start_recording() Copy Ask AI async def start_recording () Start recording audio from both user and bot sources. Initializes recording state and resets audio buffers. ​ stop_recording() Copy Ask AI async def stop_recording () Stop recording and trigger final audio data handlers with any remaining buffered audio. ​ has_audio() Copy Ask AI def has_audio () -> bool Check if both user and bot audio buffers contain data. Returns: True if both buffers contain audio data. ​ Event Handlers The processor supports multiple event handlers for different audio processing workflows. Register handlers using the @processor.event_handler() decorator. ​ on_audio_data Triggered when buffer_size is reached or recording stops, providing merged audio. Copy Ask AI @audiobuffer.event_handler ( "on_audio_data" ) async def on_audio_data ( buffer , audio : bytes , sample_rate : int , num_channels : int ): # Handle merged audio data pass Parameters: buffer : The AudioBufferProcessor instance audio : Merged audio data (format depends on num_channels setting) sample_rate : Sample rate in Hz num_channels : Number of channels (1 or 2) ​ on_track_audio_data Triggered alongside on_audio_data , providing separate user and bot audio tracks. Copy Ask AI @audiobuffer.event_handler ( "on_track_audio_data" ) async def on_track_audio_data ( buffer , user_audio : bytes , bot_audio : bytes , sample_rate : int , num_channels : int ): # Handle separate audio tracks pass Parameters: buffer : The AudioBufferProcessor instance user_audio : Raw user audio bytes (always mono) bot_audio : Raw bot audio bytes (always mono) sample_rate : Sample rate in Hz num_channels : Always 1 for individual tracks ​ on_user_turn_audio_data Triggered when a user speaking turn ends. Requires enable_turn_audio=True . Copy Ask AI @audiobuffer.event_handler ( "on_user_turn_audio_data" ) async def on_user_turn_audio_data ( buffer , audio : bytes , sample_rate : int , num_channels : int ): # Handle user turn audio pass Parameters: buffer : The AudioBufferProcessor instance audio : Audio data from the user’s speaking turn sample_rate : Sample rate in Hz num_channels : Always 1 (mono) ​ on_bot_turn_audio_data Triggered when a bot speaking turn ends. 
Requires enable_turn_audio=True . Copy Ask AI @audiobuffer.event_handler ( "on_bot_turn_audio_data" ) async def on_bot_turn_audio_data ( buffer , audio : bytes , sample_rate : int , num_channels : int ): # Handle bot turn audio pass Parameters: buffer : The AudioBufferProcessor instance audio : Audio data from the bot’s speaking turn sample_rate : Sample rate in Hz num_channels : Always 1 (mono) ​ Audio Processing Features Automatic resampling : Converts incoming audio to the specified sample rate Buffer synchronization : Aligns user and bot audio streams temporally Silence insertion : Fills gaps in non-continuous audio streams to maintain timing Turn tracking : Monitors speaking turns when enable_turn_audio=True ​ Integration Notes ​ STT Audio Passthrough If using an STT service in your pipeline, enable audio passthrough to make audio available to the AudioBufferProcessor: Copy Ask AI stt = DeepgramSTTService( api_key = os.getenv( "DEEPGRAM_API_KEY" ), audio_passthrough = True , ) audio_passthrough is enabled by default. ​ Pipeline Placement Add the AudioBufferProcessor after transport.output() to capture both user and bot audio: Copy Ask AI pipeline = Pipeline([ transport.input(), # ... other processors ... transport.output(), audiobuffer, # Place after audio output # ... remaining processors ... ]) UserIdleProcessor KoalaFilter On this page Overview Constructor Parameters Properties sample_rate num_channels Methods start_recording() stop_recording() has_audio() Event Handlers on_audio_data on_track_audio_data on_user_turn_audio_data on_bot_turn_audio_data Audio Processing Features Integration Notes STT Audio Passthrough Pipeline Placement Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/audio_audio-buffer-processor_9949bd98.txt b/audio_audio-buffer-processor_9949bd98.txt new file mode 100644 index 0000000000000000000000000000000000000000..2aaea547f4aec050cc32e412674c7eae3b753bfc --- /dev/null +++ b/audio_audio-buffer-processor_9949bd98.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/utilities/audio/audio-buffer-processor#param-buffer-size +Title: AudioBufferProcessor - Pipecat +================================================== + +AudioBufferProcessor - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Audio Processing AudioBufferProcessor Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing AudioBufferProcessor KoalaFilter KrispFilter NoisereduceFilter SileroVADAnalyzer SoundfileMixer Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview The AudioBufferProcessor captures and buffers audio frames from both input (user) and output (bot) sources during conversations. It provides synchronized audio streams with configurable sample rates, supports both mono and stereo output, and offers flexible event handlers for various audio processing workflows. 
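As a concrete starting point, the sketch below records the merged conversation to a WAV file using the constructor and on_audio_data handler documented on this page. The import path reflects the current Pipecat package layout, and the recording.wav filename plus the 16-bit sample width are illustrative assumptions (Pipecat audio is typically 16-bit PCM); start_recording() and stop_recording() would normally be called from transport connect/disconnect handlers:

import wave

from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor

# Mono output; events only fire when recording stops (buffer_size=0)
audiobuffer = AudioBufferProcessor(num_channels=1)


@audiobuffer.event_handler("on_audio_data")
async def on_audio_data(buffer, audio: bytes, sample_rate: int, num_channels: int):
    # Persist the merged user+bot audio as a WAV file (assumes 16-bit PCM samples)
    with wave.open("recording.wav", "wb") as wf:
        wf.setnchannels(num_channels)
        wf.setsampwidth(2)  # 2 bytes per sample for 16-bit PCM
        wf.setframerate(sample_rate)
        wf.writeframes(audio)


# Typically wired to transport events:
#   await audiobuffer.start_recording()   # e.g., when the client connects
#   await audiobuffer.stop_recording()    # e.g., when the client disconnects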
​ Constructor Copy Ask AI AudioBufferProcessor( sample_rate = None , num_channels = 1 , buffer_size = 0 , enable_turn_audio = False , ** kwargs ) ​ Parameters ​ sample_rate Optional[int] default: "None" The desired output sample rate in Hz. If None , uses the transport’s sample rate from the StartFrame . ​ num_channels int default: "1" Number of output audio channels: 1 : Mono output (user and bot audio are mixed together) 2 : Stereo output (user audio on left channel, bot audio on right channel) ​ buffer_size int default: "0" Buffer size in bytes that triggers audio data events: 0 : Events only trigger when recording stops >0 : Events trigger whenever buffer reaches this size (useful for chunked processing) ​ enable_turn_audio bool default: "False" Whether to enable per-turn audio event handlers ( on_user_turn_audio_data and on_bot_turn_audio_data ). ​ Properties ​ sample_rate Copy Ask AI @ property def sample_rate ( self ) -> int The current sample rate of the audio processor in Hz. ​ num_channels Copy Ask AI @ property def num_channels ( self ) -> int The number of channels in the audio output (1 for mono, 2 for stereo). ​ Methods ​ start_recording() Copy Ask AI async def start_recording () Start recording audio from both user and bot sources. Initializes recording state and resets audio buffers. ​ stop_recording() Copy Ask AI async def stop_recording () Stop recording and trigger final audio data handlers with any remaining buffered audio. ​ has_audio() Copy Ask AI def has_audio () -> bool Check if both user and bot audio buffers contain data. Returns: True if both buffers contain audio data. ​ Event Handlers The processor supports multiple event handlers for different audio processing workflows. Register handlers using the @processor.event_handler() decorator. ​ on_audio_data Triggered when buffer_size is reached or recording stops, providing merged audio. Copy Ask AI @audiobuffer.event_handler ( "on_audio_data" ) async def on_audio_data ( buffer , audio : bytes , sample_rate : int , num_channels : int ): # Handle merged audio data pass Parameters: buffer : The AudioBufferProcessor instance audio : Merged audio data (format depends on num_channels setting) sample_rate : Sample rate in Hz num_channels : Number of channels (1 or 2) ​ on_track_audio_data Triggered alongside on_audio_data , providing separate user and bot audio tracks. Copy Ask AI @audiobuffer.event_handler ( "on_track_audio_data" ) async def on_track_audio_data ( buffer , user_audio : bytes , bot_audio : bytes , sample_rate : int , num_channels : int ): # Handle separate audio tracks pass Parameters: buffer : The AudioBufferProcessor instance user_audio : Raw user audio bytes (always mono) bot_audio : Raw bot audio bytes (always mono) sample_rate : Sample rate in Hz num_channels : Always 1 for individual tracks ​ on_user_turn_audio_data Triggered when a user speaking turn ends. Requires enable_turn_audio=True . Copy Ask AI @audiobuffer.event_handler ( "on_user_turn_audio_data" ) async def on_user_turn_audio_data ( buffer , audio : bytes , sample_rate : int , num_channels : int ): # Handle user turn audio pass Parameters: buffer : The AudioBufferProcessor instance audio : Audio data from the user’s speaking turn sample_rate : Sample rate in Hz num_channels : Always 1 (mono) ​ on_bot_turn_audio_data Triggered when a bot speaking turn ends. Requires enable_turn_audio=True . 
Copy Ask AI @audiobuffer.event_handler ( "on_bot_turn_audio_data" ) async def on_bot_turn_audio_data ( buffer , audio : bytes , sample_rate : int , num_channels : int ): # Handle bot turn audio pass Parameters: buffer : The AudioBufferProcessor instance audio : Audio data from the bot’s speaking turn sample_rate : Sample rate in Hz num_channels : Always 1 (mono) ​ Audio Processing Features Automatic resampling : Converts incoming audio to the specified sample rate Buffer synchronization : Aligns user and bot audio streams temporally Silence insertion : Fills gaps in non-continuous audio streams to maintain timing Turn tracking : Monitors speaking turns when enable_turn_audio=True ​ Integration Notes ​ STT Audio Passthrough If using an STT service in your pipeline, enable audio passthrough to make audio available to the AudioBufferProcessor: Copy Ask AI stt = DeepgramSTTService( api_key = os.getenv( "DEEPGRAM_API_KEY" ), audio_passthrough = True , ) audio_passthrough is enabled by default. ​ Pipeline Placement Add the AudioBufferProcessor after transport.output() to capture both user and bot audio: Copy Ask AI pipeline = Pipeline([ transport.input(), # ... other processors ... transport.output(), audiobuffer, # Place after audio output # ... remaining processors ... ]) UserIdleProcessor KoalaFilter On this page Overview Constructor Parameters Properties sample_rate num_channels Methods start_recording() stop_recording() has_audio() Event Handlers on_audio_data on_track_audio_data on_user_turn_audio_data on_bot_turn_audio_data Audio Processing Features Integration Notes STT Audio Passthrough Pipeline Placement Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/audio_audio-buffer-processor_a965e6e4.txt b/audio_audio-buffer-processor_a965e6e4.txt new file mode 100644 index 0000000000000000000000000000000000000000..b1a9e6c24b0db7523a3c92758ad93c7cf7d8da77 --- /dev/null +++ b/audio_audio-buffer-processor_a965e6e4.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/utilities/audio/audio-buffer-processor#pipeline-placement +Title: AudioBufferProcessor - Pipecat +================================================== + +AudioBufferProcessor - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Audio Processing AudioBufferProcessor Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing AudioBufferProcessor KoalaFilter KrispFilter NoisereduceFilter SileroVADAnalyzer SoundfileMixer Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview The AudioBufferProcessor captures and buffers audio frames from both input (user) and output (bot) sources during conversations. It provides synchronized audio streams with configurable sample rates, supports both mono and stereo output, and offers flexible event handlers for various audio processing workflows. 
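For chunked processing with separate user and bot tracks, a non-zero buffer_size makes the handlers documented below fire repeatedly during recording instead of only at the end. A minimal sketch, assuming a 16 kHz sample rate and 16-bit samples for the buffer-size arithmetic:

from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor

# Emit audio in roughly 5-second chunks: 16000 samples/s * 2 bytes/sample * 5 s
audiobuffer = AudioBufferProcessor(sample_rate=16000, buffer_size=16000 * 2 * 5)

user_track = bytearray()
bot_track = bytearray()


@audiobuffer.event_handler("on_track_audio_data")
async def on_track_audio_data(buffer, user_audio: bytes, bot_audio: bytes, sample_rate: int, num_channels: int):
    # Each chunk delivers separate mono tracks; accumulate (or upload) them per side
    user_track.extend(user_audio)
    bot_track.extend(bot_audio)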
​ Constructor Copy Ask AI AudioBufferProcessor( sample_rate = None , num_channels = 1 , buffer_size = 0 , enable_turn_audio = False , ** kwargs ) ​ Parameters ​ sample_rate Optional[int] default: "None" The desired output sample rate in Hz. If None , uses the transport’s sample rate from the StartFrame . ​ num_channels int default: "1" Number of output audio channels: 1 : Mono output (user and bot audio are mixed together) 2 : Stereo output (user audio on left channel, bot audio on right channel) ​ buffer_size int default: "0" Buffer size in bytes that triggers audio data events: 0 : Events only trigger when recording stops >0 : Events trigger whenever buffer reaches this size (useful for chunked processing) ​ enable_turn_audio bool default: "False" Whether to enable per-turn audio event handlers ( on_user_turn_audio_data and on_bot_turn_audio_data ). ​ Properties ​ sample_rate Copy Ask AI @ property def sample_rate ( self ) -> int The current sample rate of the audio processor in Hz. ​ num_channels Copy Ask AI @ property def num_channels ( self ) -> int The number of channels in the audio output (1 for mono, 2 for stereo). ​ Methods ​ start_recording() Copy Ask AI async def start_recording () Start recording audio from both user and bot sources. Initializes recording state and resets audio buffers. ​ stop_recording() Copy Ask AI async def stop_recording () Stop recording and trigger final audio data handlers with any remaining buffered audio. ​ has_audio() Copy Ask AI def has_audio () -> bool Check if both user and bot audio buffers contain data. Returns: True if both buffers contain audio data. ​ Event Handlers The processor supports multiple event handlers for different audio processing workflows. Register handlers using the @processor.event_handler() decorator. ​ on_audio_data Triggered when buffer_size is reached or recording stops, providing merged audio. Copy Ask AI @audiobuffer.event_handler ( "on_audio_data" ) async def on_audio_data ( buffer , audio : bytes , sample_rate : int , num_channels : int ): # Handle merged audio data pass Parameters: buffer : The AudioBufferProcessor instance audio : Merged audio data (format depends on num_channels setting) sample_rate : Sample rate in Hz num_channels : Number of channels (1 or 2) ​ on_track_audio_data Triggered alongside on_audio_data , providing separate user and bot audio tracks. Copy Ask AI @audiobuffer.event_handler ( "on_track_audio_data" ) async def on_track_audio_data ( buffer , user_audio : bytes , bot_audio : bytes , sample_rate : int , num_channels : int ): # Handle separate audio tracks pass Parameters: buffer : The AudioBufferProcessor instance user_audio : Raw user audio bytes (always mono) bot_audio : Raw bot audio bytes (always mono) sample_rate : Sample rate in Hz num_channels : Always 1 for individual tracks ​ on_user_turn_audio_data Triggered when a user speaking turn ends. Requires enable_turn_audio=True . Copy Ask AI @audiobuffer.event_handler ( "on_user_turn_audio_data" ) async def on_user_turn_audio_data ( buffer , audio : bytes , sample_rate : int , num_channels : int ): # Handle user turn audio pass Parameters: buffer : The AudioBufferProcessor instance audio : Audio data from the user’s speaking turn sample_rate : Sample rate in Hz num_channels : Always 1 (mono) ​ on_bot_turn_audio_data Triggered when a bot speaking turn ends. Requires enable_turn_audio=True . 
Copy Ask AI @audiobuffer.event_handler ( "on_bot_turn_audio_data" ) async def on_bot_turn_audio_data ( buffer , audio : bytes , sample_rate : int , num_channels : int ): # Handle bot turn audio pass Parameters: buffer : The AudioBufferProcessor instance audio : Audio data from the bot’s speaking turn sample_rate : Sample rate in Hz num_channels : Always 1 (mono) ​ Audio Processing Features Automatic resampling : Converts incoming audio to the specified sample rate Buffer synchronization : Aligns user and bot audio streams temporally Silence insertion : Fills gaps in non-continuous audio streams to maintain timing Turn tracking : Monitors speaking turns when enable_turn_audio=True ​ Integration Notes ​ STT Audio Passthrough If using an STT service in your pipeline, enable audio passthrough to make audio available to the AudioBufferProcessor: Copy Ask AI stt = DeepgramSTTService( api_key = os.getenv( "DEEPGRAM_API_KEY" ), audio_passthrough = True , ) audio_passthrough is enabled by default. ​ Pipeline Placement Add the AudioBufferProcessor after transport.output() to capture both user and bot audio: Copy Ask AI pipeline = Pipeline([ transport.input(), # ... other processors ... transport.output(), audiobuffer, # Place after audio output # ... remaining processors ... ]) UserIdleProcessor KoalaFilter On this page Overview Constructor Parameters Properties sample_rate num_channels Methods start_recording() stop_recording() has_audio() Event Handlers on_audio_data on_track_audio_data on_user_turn_audio_data on_bot_turn_audio_data Audio Processing Features Integration Notes STT Audio Passthrough Pipeline Placement Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/audio_audio-buffer-processor_f9116c60.txt b/audio_audio-buffer-processor_f9116c60.txt new file mode 100644 index 0000000000000000000000000000000000000000..6c12fe45d615bd3a33d78ee277c33a711248c9f5 --- /dev/null +++ b/audio_audio-buffer-processor_f9116c60.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/utilities/audio/audio-buffer-processor#num-channels +Title: AudioBufferProcessor - Pipecat +================================================== + +AudioBufferProcessor - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Audio Processing AudioBufferProcessor Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing AudioBufferProcessor KoalaFilter KrispFilter NoisereduceFilter SileroVADAnalyzer SoundfileMixer Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview The AudioBufferProcessor captures and buffers audio frames from both input (user) and output (bot) sources during conversations. It provides synchronized audio streams with configurable sample rates, supports both mono and stereo output, and offers flexible event handlers for various audio processing workflows. 
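For per-turn capture, enable_turn_audio must be set and the turn handlers documented below registered. A minimal sketch that simply collects each completed speaking turn in memory (the user_turns and bot_turns lists are illustrative):

from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor

# Per-turn handlers require enable_turn_audio=True
audiobuffer = AudioBufferProcessor(enable_turn_audio=True)

user_turns: list[bytes] = []
bot_turns: list[bytes] = []


@audiobuffer.event_handler("on_user_turn_audio_data")
async def on_user_turn_audio_data(buffer, audio: bytes, sample_rate: int, num_channels: int):
    # Fired when a user speaking turn ends; audio is the whole turn, mono
    user_turns.append(audio)


@audiobuffer.event_handler("on_bot_turn_audio_data")
async def on_bot_turn_audio_data(buffer, audio: bytes, sample_rate: int, num_channels: int):
    # Fired when a bot speaking turn ends
    bot_turns.append(audio)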
​ Constructor Copy Ask AI AudioBufferProcessor( sample_rate = None , num_channels = 1 , buffer_size = 0 , enable_turn_audio = False , ** kwargs ) ​ Parameters ​ sample_rate Optional[int] default: "None" The desired output sample rate in Hz. If None , uses the transport’s sample rate from the StartFrame . ​ num_channels int default: "1" Number of output audio channels: 1 : Mono output (user and bot audio are mixed together) 2 : Stereo output (user audio on left channel, bot audio on right channel) ​ buffer_size int default: "0" Buffer size in bytes that triggers audio data events: 0 : Events only trigger when recording stops >0 : Events trigger whenever buffer reaches this size (useful for chunked processing) ​ enable_turn_audio bool default: "False" Whether to enable per-turn audio event handlers ( on_user_turn_audio_data and on_bot_turn_audio_data ). ​ Properties ​ sample_rate Copy Ask AI @ property def sample_rate ( self ) -> int The current sample rate of the audio processor in Hz. ​ num_channels Copy Ask AI @ property def num_channels ( self ) -> int The number of channels in the audio output (1 for mono, 2 for stereo). ​ Methods ​ start_recording() Copy Ask AI async def start_recording () Start recording audio from both user and bot sources. Initializes recording state and resets audio buffers. ​ stop_recording() Copy Ask AI async def stop_recording () Stop recording and trigger final audio data handlers with any remaining buffered audio. ​ has_audio() Copy Ask AI def has_audio () -> bool Check if both user and bot audio buffers contain data. Returns: True if both buffers contain audio data. ​ Event Handlers The processor supports multiple event handlers for different audio processing workflows. Register handlers using the @processor.event_handler() decorator. ​ on_audio_data Triggered when buffer_size is reached or recording stops, providing merged audio. Copy Ask AI @audiobuffer.event_handler ( "on_audio_data" ) async def on_audio_data ( buffer , audio : bytes , sample_rate : int , num_channels : int ): # Handle merged audio data pass Parameters: buffer : The AudioBufferProcessor instance audio : Merged audio data (format depends on num_channels setting) sample_rate : Sample rate in Hz num_channels : Number of channels (1 or 2) ​ on_track_audio_data Triggered alongside on_audio_data , providing separate user and bot audio tracks. Copy Ask AI @audiobuffer.event_handler ( "on_track_audio_data" ) async def on_track_audio_data ( buffer , user_audio : bytes , bot_audio : bytes , sample_rate : int , num_channels : int ): # Handle separate audio tracks pass Parameters: buffer : The AudioBufferProcessor instance user_audio : Raw user audio bytes (always mono) bot_audio : Raw bot audio bytes (always mono) sample_rate : Sample rate in Hz num_channels : Always 1 for individual tracks ​ on_user_turn_audio_data Triggered when a user speaking turn ends. Requires enable_turn_audio=True . Copy Ask AI @audiobuffer.event_handler ( "on_user_turn_audio_data" ) async def on_user_turn_audio_data ( buffer , audio : bytes , sample_rate : int , num_channels : int ): # Handle user turn audio pass Parameters: buffer : The AudioBufferProcessor instance audio : Audio data from the user’s speaking turn sample_rate : Sample rate in Hz num_channels : Always 1 (mono) ​ on_bot_turn_audio_data Triggered when a bot speaking turn ends. Requires enable_turn_audio=True . 
Copy Ask AI @audiobuffer.event_handler ( "on_bot_turn_audio_data" ) async def on_bot_turn_audio_data ( buffer , audio : bytes , sample_rate : int , num_channels : int ): # Handle bot turn audio pass Parameters: buffer : The AudioBufferProcessor instance audio : Audio data from the bot’s speaking turn sample_rate : Sample rate in Hz num_channels : Always 1 (mono) ​ Audio Processing Features Automatic resampling : Converts incoming audio to the specified sample rate Buffer synchronization : Aligns user and bot audio streams temporally Silence insertion : Fills gaps in non-continuous audio streams to maintain timing Turn tracking : Monitors speaking turns when enable_turn_audio=True ​ Integration Notes ​ STT Audio Passthrough If using an STT service in your pipeline, enable audio passthrough to make audio available to the AudioBufferProcessor: Copy Ask AI stt = DeepgramSTTService( api_key = os.getenv( "DEEPGRAM_API_KEY" ), audio_passthrough = True , ) audio_passthrough is enabled by default. ​ Pipeline Placement Add the AudioBufferProcessor after transport.output() to capture both user and bot audio: Copy Ask AI pipeline = Pipeline([ transport.input(), # ... other processors ... transport.output(), audiobuffer, # Place after audio output # ... remaining processors ... ]) UserIdleProcessor KoalaFilter On this page Overview Constructor Parameters Properties sample_rate num_channels Methods start_recording() stop_recording() has_audio() Event Handlers on_audio_data on_track_audio_data on_user_turn_audio_data on_bot_turn_audio_data Audio Processing Features Integration Notes STT Audio Passthrough Pipeline Placement Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/audio_koala-filter_4cd6e2cf.txt b/audio_koala-filter_4cd6e2cf.txt new file mode 100644 index 0000000000000000000000000000000000000000..7730261c21973f43e39fc55b4ee1c02bd66ca5e8 --- /dev/null +++ b/audio_koala-filter_4cd6e2cf.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/utilities/audio/koala-filter +Title: KoalaFilter - Pipecat +================================================== + +KoalaFilter - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Audio Processing KoalaFilter Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing AudioBufferProcessor KoalaFilter KrispFilter NoisereduceFilter SileroVADAnalyzer SoundfileMixer Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview KoalaFilter is an audio processor that reduces background noise in real-time audio streams using Koala Noise Suppression technology from Picovoice. It inherits from BaseAudioFilter and processes audio frames to improve audio quality by removing unwanted noise. To use Koala, you need a Picovoice access key. Get started at Picovoice Console . 
​ Installation The Koala filter requires additional dependencies: Copy Ask AI pip install "pipecat-ai[koala]" You’ll also need to set up your Koala access key as an environment variable: KOALA_ACCESS_KEY ​ Constructor Parameters ​ access_key str required Picovoice access key for using the Koala noise suppression service ​ Input Frames ​ FilterEnableFrame Frame Specific control frame to toggle filtering on/off Copy Ask AI from pipecat.frames.frames import FilterEnableFrame # Disable noise reduction await task.queue_frame(FilterEnableFrame( False )) # Re-enable noise reduction await task.queue_frame(FilterEnableFrame( True )) ​ Usage Example Copy Ask AI from pipecat.audio.filters.koala_filter import KoalaFilter transport = DailyTransport( room_url, token, "Respond bot" , DailyParams( audio_in_filter = KoalaFilter( access_key = os.getenv( "KOALA_ACCESS_KEY" )), # Enable Koala noise reduction audio_in_enabled = True , audio_out_enabled = True , vad_analyzer = SileroVADAnalyzer(), ), ) ​ Audio Flow ​ Notes Requires Picovoice access key Supports real-time audio processing Handles 16-bit PCM audio format Can be dynamically enabled/disabled Maintains audio quality while reducing noise Efficient processing for low latency Automatically handles audio frame buffering Sample rate must match Koala’s required sample rate AudioBufferProcessor KrispFilter On this page Overview Installation Constructor Parameters Input Frames Usage Example Audio Flow Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/audio_koala-filter_bef194f4.txt b/audio_koala-filter_bef194f4.txt new file mode 100644 index 0000000000000000000000000000000000000000..2b84513326cfdd12b3f084d18ec04290da45166b --- /dev/null +++ b/audio_koala-filter_bef194f4.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/utilities/audio/koala-filter#audio-flow +Title: KoalaFilter - Pipecat +================================================== + +KoalaFilter - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Audio Processing KoalaFilter Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing AudioBufferProcessor KoalaFilter KrispFilter NoisereduceFilter SileroVADAnalyzer SoundfileMixer Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview KoalaFilter is an audio processor that reduces background noise in real-time audio streams using Koala Noise Suppression technology from Picovoice. It inherits from BaseAudioFilter and processes audio frames to improve audio quality by removing unwanted noise. To use Koala, you need a Picovoice access key. Get started at Picovoice Console . 
​ Installation The Koala filter requires additional dependencies: Copy Ask AI pip install "pipecat-ai[koala]" You’ll also need to set up your Koala access key as an environment variable: KOALA_ACCESS_KEY ​ Constructor Parameters ​ access_key str required Picovoice access key for using the Koala noise suppression service ​ Input Frames ​ FilterEnableFrame Frame Specific control frame to toggle filtering on/off Copy Ask AI from pipecat.frames.frames import FilterEnableFrame # Disable noise reduction await task.queue_frame(FilterEnableFrame( False )) # Re-enable noise reduction await task.queue_frame(FilterEnableFrame( True )) ​ Usage Example Copy Ask AI from pipecat.audio.filters.koala_filter import KoalaFilter transport = DailyTransport( room_url, token, "Respond bot" , DailyParams( audio_in_filter = KoalaFilter( access_key = os.getenv( "KOALA_ACCESS_KEY" )), # Enable Koala noise reduction audio_in_enabled = True , audio_out_enabled = True , vad_analyzer = SileroVADAnalyzer(), ), ) ​ Audio Flow ​ Notes Requires Picovoice access key Supports real-time audio processing Handles 16-bit PCM audio format Can be dynamically enabled/disabled Maintains audio quality while reducing noise Efficient processing for low latency Automatically handles audio frame buffering Sample rate must match Koala’s required sample rate AudioBufferProcessor KrispFilter On this page Overview Installation Constructor Parameters Input Frames Usage Example Audio Flow Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/audio_koala-filter_e4ba46aa.txt b/audio_koala-filter_e4ba46aa.txt new file mode 100644 index 0000000000000000000000000000000000000000..ba8728dd4e5bd4d4458ba9d9e22eb3508b7a5564 --- /dev/null +++ b/audio_koala-filter_e4ba46aa.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/utilities/audio/koala-filter#notes +Title: KoalaFilter - Pipecat +================================================== + +KoalaFilter - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Audio Processing KoalaFilter Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing AudioBufferProcessor KoalaFilter KrispFilter NoisereduceFilter SileroVADAnalyzer SoundfileMixer Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview KoalaFilter is an audio processor that reduces background noise in real-time audio streams using Koala Noise Suppression technology from Picovoice. It inherits from BaseAudioFilter and processes audio frames to improve audio quality by removing unwanted noise. To use Koala, you need a Picovoice access key. Get started at Picovoice Console . 
​ Installation The Koala filter requires additional dependencies: Copy Ask AI pip install "pipecat-ai[koala]" You’ll also need to set up your Koala access key as an environment variable: KOALA_ACCESS_KEY ​ Constructor Parameters ​ access_key str required Picovoice access key for using the Koala noise suppression service ​ Input Frames ​ FilterEnableFrame Frame Specific control frame to toggle filtering on/off Copy Ask AI from pipecat.frames.frames import FilterEnableFrame # Disable noise reduction await task.queue_frame(FilterEnableFrame( False )) # Re-enable noise reduction await task.queue_frame(FilterEnableFrame( True )) ​ Usage Example Copy Ask AI from pipecat.audio.filters.koala_filter import KoalaFilter transport = DailyTransport( room_url, token, "Respond bot" , DailyParams( audio_in_filter = KoalaFilter( access_key = os.getenv( "KOALA_ACCESS_KEY" )), # Enable Koala noise reduction audio_in_enabled = True , audio_out_enabled = True , vad_analyzer = SileroVADAnalyzer(), ), ) ​ Audio Flow ​ Notes Requires Picovoice access key Supports real-time audio processing Handles 16-bit PCM audio format Can be dynamically enabled/disabled Maintains audio quality while reducing noise Efficient processing for low latency Automatically handles audio frame buffering Sample rate must match Koala’s required sample rate AudioBufferProcessor KrispFilter On this page Overview Installation Constructor Parameters Input Frames Usage Example Audio Flow Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/audio_krisp-filter_06341763.txt b/audio_krisp-filter_06341763.txt new file mode 100644 index 0000000000000000000000000000000000000000..61b447c56a0f3ccd766b91dd874085b95f1a14bb --- /dev/null +++ b/audio_krisp-filter_06341763.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/utilities/audio/krisp-filter#param-channels +Title: KrispFilter - Pipecat +================================================== + +KrispFilter - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Audio Processing KrispFilter Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing AudioBufferProcessor KoalaFilter KrispFilter NoisereduceFilter SileroVADAnalyzer SoundfileMixer Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview KrispFilter is an audio processor that reduces background noise in real-time audio streams using Krisp AI technology. It inherits from BaseAudioFilter and processes audio frames to improve audio quality by removing unwanted noise. To use Krisp, you need a Krisp SDK license. Get started at Krisp.ai . Looking for help getting started with Krisp and Pipecat? Checkout our Krisp noise cancellation guide . ​ Installation The Krisp filter requires additional dependencies: Copy Ask AI pip install "pipecat-ai[krisp]" ​ Environment Variables You need to provide the path to the Krisp model. 
Constructor Parameters

sample_type (str, default: "PCM_16")
    Audio sample type format.

channels (int, default: 1)
    Number of audio channels.

model_path (str, default: None)
    Path to the Krisp model file. You can set model_path directly, or set the KRISP_MODEL_PATH environment variable to the model file path.

Input Frames

FilterEnableFrame (Frame)
    Specific control frame to toggle filtering on/off.

    from pipecat.frames.frames import FilterEnableFrame

    # Disable noise reduction
    await task.queue_frame(FilterEnableFrame(False))

    # Re-enable noise reduction
    await task.queue_frame(FilterEnableFrame(True))

Usage Example

    from pipecat.audio.filters.krisp_filter import KrispFilter

    transport = DailyTransport(
        room_url,
        token,
        "Respond bot",
        DailyParams(
            audio_in_filter=KrispFilter(),  # Enable Krisp noise reduction
            audio_in_enabled=True,
            audio_out_enabled=True,
            vad_analyzer=SileroVADAnalyzer(),
        ),
    )

Audio Flow

(Diagram not reproduced in this capture.)

Notes

- Requires the Krisp SDK and model file to be available
- Supports real-time audio processing
- Supports additional features like background voice removal
- Handles PCM_16 audio format
- Thread-safe for pipeline processing
- Can be dynamically enabled/disabled
- Maintains audio quality while reducing noise
- Efficient processing for low latency
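For completeness, a constructor call that spells out all three documented parameters might look like the sketch below. The values other than the documented defaults are illustrative, and the model path is a placeholder.

    from pipecat.audio.filters.krisp_filter import KrispFilter

    krisp_filter = KrispFilter(
        sample_type="PCM_16",               # documented default
        channels=1,                         # documented default: mono input
        model_path="/path/to/krisp-model",  # placeholder; or rely on KRISP_MODEL_PATH
    )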
URL: https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter#usage-example
Title: NoisereduceFilter - Pipecat
==================================================

Overview

NoisereduceFilter is an audio processor that reduces background noise in real-time audio streams using the noisereduce library. It inherits from BaseAudioFilter and processes audio frames to improve audio quality by removing unwanted noise.

Installation

The noisereduce filter requires additional dependencies:

    pip install "pipecat-ai[noisereduce]"

Constructor Parameters

This filter has no configurable parameters in its constructor.
Input Frames

FilterEnableFrame (Frame)
    Specific control frame to toggle filtering on/off.

    from pipecat.frames.frames import FilterEnableFrame

    # Disable noise reduction
    await task.queue_frame(FilterEnableFrame(False))

    # Re-enable noise reduction
    await task.queue_frame(FilterEnableFrame(True))

Usage Example

    from pipecat.audio.filters.noisereduce_filter import NoisereduceFilter

    transport = DailyTransport(
        room_url,
        token,
        "Respond bot",
        DailyParams(
            audio_in_filter=NoisereduceFilter(),  # Enable noise reduction
            audio_in_enabled=True,
            audio_out_enabled=True,
            vad_analyzer=SileroVADAnalyzer(),
        ),
    )

Audio Flow

(Diagram not reproduced in this capture.)

Notes

- Lightweight alternative to Krisp for noise reduction
- Supports real-time audio processing
- Handles PCM_16 audio format
- Thread-safe for pipeline processing
- Can be dynamically enabled/disabled
- No additional configuration required
- Uses statistical noise reduction techniques
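Because NoisereduceFilter needs no license or model file, it can serve as a fallback when Krisp is not configured. A minimal sketch of choosing a filter at startup; the selection logic is illustrative, and the import paths come from the usage examples on these pages:

    import os

    from pipecat.audio.filters.krisp_filter import KrispFilter
    from pipecat.audio.filters.noisereduce_filter import NoisereduceFilter


    def make_input_filter():
        # Prefer Krisp when a model path is configured; otherwise fall back to noisereduce.
        if os.getenv("KRISP_MODEL_PATH"):
            return KrispFilter()
        return NoisereduceFilter()

    # Pass the result as DailyParams(audio_in_filter=make_input_filter()) as shown above.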
URL: https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer#constructor-parameters
Title: SileroVADAnalyzer - Pipecat
==================================================

Overview

SileroVADAnalyzer is a Voice Activity Detection (VAD) analyzer that uses the Silero VAD ONNX model to detect speech in audio streams. It provides high-accuracy speech detection with efficient processing using the ONNX runtime.

Installation

The Silero VAD analyzer requires additional dependencies:

    pip install "pipecat-ai[silero]"

Constructor Parameters

sample_rate (int, default: None)
    Audio sample rate in Hz. Must be either 8000 or 16000.

params (VADParams, default: VADParams())
    Voice Activity Detection parameters object with the following properties:

    confidence (float, default: 0.7)
        Confidence threshold for speech detection. Higher values make detection stricter. Must be between 0 and 1.

    start_secs (float, default: 0.2)
        Time in seconds that speech must be detected before transitioning to the SPEAKING state.

    stop_secs (float, default: 0.8)
        Time in seconds of silence required before transitioning back to the QUIET state.

    min_volume (float, default: 0.6)
        Minimum audio volume threshold for speech detection. Must be between 0 and 1.
Usage Example

    transport = DailyTransport(
        room_url,
        token,
        "Respond bot",
        DailyParams(
            audio_in_enabled=True,
            audio_out_enabled=True,
            vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.5)),
        ),
    )

Technical Details

Sample Rate Requirements

The analyzer supports two sample rates:

- 8000 Hz (256 samples per frame)
- 16000 Hz (512 samples per frame)

Model Management

- Uses the ONNX runtime for efficient inference
- Automatically resets model state every 5 seconds to manage memory
- Runs on CPU by default for consistent performance
- Includes a built-in model file

Notes

- High-accuracy speech detection
- Efficient ONNX-based processing
- Automatic memory management
- Thread-safe for pipeline processing
- Built-in model file included
- CPU-optimized inference
- Supports 8 kHz and 16 kHz audio
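To tune more than stop_secs, you can construct the full parameters object explicitly. A minimal sketch under two assumptions: the import paths follow Pipecat's pipecat.audio.vad modules, and the values shown are the documented defaults used for illustration, not tuning recommendations.

    from pipecat.audio.vad.silero import SileroVADAnalyzer  # assumed import path
    from pipecat.audio.vad.vad_analyzer import VADParams    # assumed import path

    vad = SileroVADAnalyzer(
        sample_rate=16000,           # must be 8000 or 16000
        params=VADParams(
            confidence=0.7,          # detection gets stricter as this approaches 1.0
            start_secs=0.2,          # speech must persist this long before SPEAKING
            stop_secs=0.8,           # silence required before returning to QUIET
            min_volume=0.6,          # ignore very quiet audio
        ),
    )

    # Pass as DailyParams(vad_analyzer=vad), exactly as in the usage example above.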
URL: https://docs.pipecat.ai/server/utilities/audio/soundfile-mixer#overview
Title: SoundfileMixer - Pipecat
==================================================

Overview

SoundfileMixer is an audio mixer that combines incoming audio with audio from files. It supports multiple audio file formats through the soundfile library and can handle runtime volume adjustments and sound switching.

Installation

The soundfile mixer requires additional dependencies:

    pip install "pipecat-ai[soundfile]"

Constructor Parameters

sound_files (Mapping[str, str], required)
    Dictionary mapping sound names to file paths. Files must be mono (single channel).

default_sound (str, required)
    Name of the default sound to play (must be a key in sound_files).

volume (float, default: 0.4)
    Initial volume for the mixed sound. Values typically range from 0.0 to 1.0, but can go higher.

loop (bool, default: True)
    Whether to loop the sound file when it reaches the end.
Control Frames

MixerUpdateSettingsFrame (Frame)
    Updates mixer settings at runtime. Properties:

    sound (str)
        Changes the currently playing sound (must be a key in sound_files).

    volume (float)
        Updates the mixing volume.

    loop (bool)
        Updates whether the sound should loop.

MixerEnableFrame (Frame)
    Enables or disables the mixer. Properties:

    enable (bool)
        Whether mixing should be enabled.

Usage Example

    # Initialize mixer with sound files
    mixer = SoundfileMixer(
        sound_files={"office": "office_ambience.wav"},
        default_sound="office",
        volume=2.0,
    )

    # Add to transport
    transport = DailyTransport(
        room_url,
        token,
        "Audio Bot",
        DailyParams(
            audio_out_enabled=True,
            audio_out_mixer=mixer,
        ),
    )

    # Control mixer at runtime
    await task.queue_frame(MixerUpdateSettingsFrame({"volume": 0.5}))
    await task.queue_frame(MixerEnableFrame(False))  # Disable mixing
    await task.queue_frame(MixerEnableFrame(True))   # Enable mixing

Notes

- Supports any audio format that soundfile can read
- Automatically resamples audio files to match the output sample rate
- Files must be mono (single channel)
- Thread-safe for pipeline processing
- Can dynamically switch between multiple sound files
- Volume can be adjusted in real-time
- Mixing can be enabled/disabled on demand
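Switching background sounds at runtime follows the same pattern as the volume update above. A minimal sketch; the second sound name and file are hypothetical, the frames are assumed to live in pipecat.frames.frames like the other control frames on these pages, and the calls belong inside an async context:

    from pipecat.frames.frames import MixerEnableFrame, MixerUpdateSettingsFrame  # assumed import path

    # Both sounds must be registered up front in the constructor's sound_files mapping, e.g.
    # {"office": "office_ambience.wav", "cafe": "cafe_ambience.wav"}  (hypothetical files).

    # Later, while the pipeline is running, switch the background track and adjust its level:
    await task.queue_frame(MixerUpdateSettingsFrame({"sound": "cafe"}))
    await task.queue_frame(MixerUpdateSettingsFrame({"volume": 0.8}))

    # Pause the background audio entirely, then bring it back.
    await task.queue_frame(MixerEnableFrame(False))
    await task.queue_frame(MixerEnableFrame(True))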
​ loop bool default: "true" Whether to loop the sound file when it reaches the end. ​ Control Frames ​ MixerUpdateSettingsFrame Frame Updates mixer settings at runtime Show properties ​ sound str Changes the current playing sound (must be a key in sound_files) ​ volume float Updates the mixing volume ​ loop bool Updates whether the sound should loop ​ MixerEnableFrame Frame Enables or disables the mixer Show properties ​ enable bool Whether mixing should be enabled ​ Usage Example Copy Ask AI # Initialize mixer with sound files mixer = SoundfileMixer( sound_files = { "office" : "office_ambience.wav" }, default_sound = "office" , volume = 2.0 , ) # Add to transport transport = DailyTransport( room_url, token, "Audio Bot" , DailyParams( audio_out_enabled = True , audio_out_mixer = mixer, ), ) # Control mixer at runtime await task.queue_frame(MixerUpdateSettingsFrame({ "volume" : 0.5 })) await task.queue_frame(MixerEnableFrame( False )) # Disable mixing await task.queue_frame(MixerEnableFrame( True )) # Enable mixing ​ Notes Supports any audio format that soundfile can read Automatically resamples audio files to match output sample rate Files must be mono (single channel) Thread-safe for pipeline processing Can dynamically switch between multiple sound files Volume can be adjusted in real-time Mixing can be enabled/disabled on demand SileroVADAnalyzer FrameFilter On this page Overview Installation Constructor Parameters Control Frames Usage Example Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/audio_soundfile-mixer_1684ce92.txt b/audio_soundfile-mixer_1684ce92.txt new file mode 100644 index 0000000000000000000000000000000000000000..bc2449187f4c34bce16066ad30c5e6c8afe47c39 --- /dev/null +++ b/audio_soundfile-mixer_1684ce92.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/utilities/audio/soundfile-mixer#param-sound-files +Title: SoundfileMixer - Pipecat +================================================== + +SoundfileMixer - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Audio Processing SoundfileMixer Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing AudioBufferProcessor KoalaFilter KrispFilter NoisereduceFilter SileroVADAnalyzer SoundfileMixer Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview SoundfileMixer is an audio mixer that combines incoming audio with audio from files. It supports multiple audio file formats through the soundfile library and can handle runtime volume adjustments and sound switching. ​ Installation The soundfile mixer requires additional dependencies: Copy Ask AI pip install "pipecat-ai[soundfile]" ​ Constructor Parameters ​ sound_files Mapping[str, str] required Dictionary mapping sound names to file paths. Files must be mono (single channel). ​ default_sound str required Name of the default sound to play (must be a key in sound_files). 
​ volume float default: "0.4" Initial volume for the mixed sound. Values typically range from 0.0 to 1.0, but can go higher. ​ loop bool default: "true" Whether to loop the sound file when it reaches the end. ​ Control Frames ​ MixerUpdateSettingsFrame Frame Updates mixer settings at runtime Show properties ​ sound str Changes the current playing sound (must be a key in sound_files) ​ volume float Updates the mixing volume ​ loop bool Updates whether the sound should loop ​ MixerEnableFrame Frame Enables or disables the mixer Show properties ​ enable bool Whether mixing should be enabled ​ Usage Example Copy Ask AI # Initialize mixer with sound files mixer = SoundfileMixer( sound_files = { "office" : "office_ambience.wav" }, default_sound = "office" , volume = 2.0 , ) # Add to transport transport = DailyTransport( room_url, token, "Audio Bot" , DailyParams( audio_out_enabled = True , audio_out_mixer = mixer, ), ) # Control mixer at runtime await task.queue_frame(MixerUpdateSettingsFrame({ "volume" : 0.5 })) await task.queue_frame(MixerEnableFrame( False )) # Disable mixing await task.queue_frame(MixerEnableFrame( True )) # Enable mixing ​ Notes Supports any audio format that soundfile can read Automatically resamples audio files to match output sample rate Files must be mono (single channel) Thread-safe for pipeline processing Can dynamically switch between multiple sound files Volume can be adjusted in real-time Mixing can be enabled/disabled on demand SileroVADAnalyzer FrameFilter On this page Overview Installation Constructor Parameters Control Frames Usage Example Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/audio_soundfile-mixer_49840d70.txt b/audio_soundfile-mixer_49840d70.txt new file mode 100644 index 0000000000000000000000000000000000000000..570c73b91d49f3b12212d7634890b20e5fe200ac --- /dev/null +++ b/audio_soundfile-mixer_49840d70.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/utilities/audio/soundfile-mixer#param-enable +Title: SoundfileMixer - Pipecat +================================================== + +SoundfileMixer - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Audio Processing SoundfileMixer Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing AudioBufferProcessor KoalaFilter KrispFilter NoisereduceFilter SileroVADAnalyzer SoundfileMixer Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview SoundfileMixer is an audio mixer that combines incoming audio with audio from files. It supports multiple audio file formats through the soundfile library and can handle runtime volume adjustments and sound switching. ​ Installation The soundfile mixer requires additional dependencies: Copy Ask AI pip install "pipecat-ai[soundfile]" ​ Constructor Parameters ​ sound_files Mapping[str, str] required Dictionary mapping sound names to file paths. Files must be mono (single channel). 
​ default_sound str required Name of the default sound to play (must be a key in sound_files). ​ volume float default: "0.4" Initial volume for the mixed sound. Values typically range from 0.0 to 1.0, but can go higher. ​ loop bool default: "true" Whether to loop the sound file when it reaches the end. ​ Control Frames ​ MixerUpdateSettingsFrame Frame Updates mixer settings at runtime Show properties ​ sound str Changes the current playing sound (must be a key in sound_files) ​ volume float Updates the mixing volume ​ loop bool Updates whether the sound should loop ​ MixerEnableFrame Frame Enables or disables the mixer Show properties ​ enable bool Whether mixing should be enabled ​ Usage Example Copy Ask AI # Initialize mixer with sound files mixer = SoundfileMixer( sound_files = { "office" : "office_ambience.wav" }, default_sound = "office" , volume = 2.0 , ) # Add to transport transport = DailyTransport( room_url, token, "Audio Bot" , DailyParams( audio_out_enabled = True , audio_out_mixer = mixer, ), ) # Control mixer at runtime await task.queue_frame(MixerUpdateSettingsFrame({ "volume" : 0.5 })) await task.queue_frame(MixerEnableFrame( False )) # Disable mixing await task.queue_frame(MixerEnableFrame( True )) # Enable mixing ​ Notes Supports any audio format that soundfile can read Automatically resamples audio files to match output sample rate Files must be mono (single channel) Thread-safe for pipeline processing Can dynamically switch between multiple sound files Volume can be adjusted in real-time Mixing can be enabled/disabled on demand SileroVADAnalyzer FrameFilter On this page Overview Installation Constructor Parameters Control Frames Usage Example Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/audio_soundfile-mixer_5aa2a166.txt b/audio_soundfile-mixer_5aa2a166.txt new file mode 100644 index 0000000000000000000000000000000000000000..1dab5663e7ab45c2b625e017dde27f2db1f06c65 --- /dev/null +++ b/audio_soundfile-mixer_5aa2a166.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/utilities/audio/soundfile-mixer#param-loop-1 +Title: SoundfileMixer - Pipecat +================================================== + +SoundfileMixer - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Audio Processing SoundfileMixer Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing AudioBufferProcessor KoalaFilter KrispFilter NoisereduceFilter SileroVADAnalyzer SoundfileMixer Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview SoundfileMixer is an audio mixer that combines incoming audio with audio from files. It supports multiple audio file formats through the soundfile library and can handle runtime volume adjustments and sound switching. 
​ Installation The soundfile mixer requires additional dependencies: Copy Ask AI pip install "pipecat-ai[soundfile]" ​ Constructor Parameters ​ sound_files Mapping[str, str] required Dictionary mapping sound names to file paths. Files must be mono (single channel). ​ default_sound str required Name of the default sound to play (must be a key in sound_files). ​ volume float default: "0.4" Initial volume for the mixed sound. Values typically range from 0.0 to 1.0, but can go higher. ​ loop bool default: "true" Whether to loop the sound file when it reaches the end. ​ Control Frames ​ MixerUpdateSettingsFrame Frame Updates mixer settings at runtime Show properties ​ sound str Changes the current playing sound (must be a key in sound_files) ​ volume float Updates the mixing volume ​ loop bool Updates whether the sound should loop ​ MixerEnableFrame Frame Enables or disables the mixer Show properties ​ enable bool Whether mixing should be enabled ​ Usage Example Copy Ask AI # Initialize mixer with sound files mixer = SoundfileMixer( sound_files = { "office" : "office_ambience.wav" }, default_sound = "office" , volume = 2.0 , ) # Add to transport transport = DailyTransport( room_url, token, "Audio Bot" , DailyParams( audio_out_enabled = True , audio_out_mixer = mixer, ), ) # Control mixer at runtime await task.queue_frame(MixerUpdateSettingsFrame({ "volume" : 0.5 })) await task.queue_frame(MixerEnableFrame( False )) # Disable mixing await task.queue_frame(MixerEnableFrame( True )) # Enable mixing ​ Notes Supports any audio format that soundfile can read Automatically resamples audio files to match output sample rate Files must be mono (single channel) Thread-safe for pipeline processing Can dynamically switch between multiple sound files Volume can be adjusted in real-time Mixing can be enabled/disabled on demand SileroVADAnalyzer FrameFilter On this page Overview Installation Constructor Parameters Control Frames Usage Example Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/audio_soundfile-mixer_d2c38853.txt b/audio_soundfile-mixer_d2c38853.txt new file mode 100644 index 0000000000000000000000000000000000000000..53697a49ea4bb4a6c9e7361d298b5f87b8881bbb --- /dev/null +++ b/audio_soundfile-mixer_d2c38853.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/utilities/audio/soundfile-mixer#constructor-parameters +Title: SoundfileMixer - Pipecat +================================================== + +SoundfileMixer - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Audio Processing SoundfileMixer Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing AudioBufferProcessor KoalaFilter KrispFilter NoisereduceFilter SileroVADAnalyzer SoundfileMixer Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview SoundfileMixer is an audio mixer that combines incoming audio with audio from files. 
It supports multiple audio file formats through the soundfile library and can handle runtime volume adjustments and sound switching. ​ Installation The soundfile mixer requires additional dependencies: Copy Ask AI pip install "pipecat-ai[soundfile]" ​ Constructor Parameters ​ sound_files Mapping[str, str] required Dictionary mapping sound names to file paths. Files must be mono (single channel). ​ default_sound str required Name of the default sound to play (must be a key in sound_files). ​ volume float default: "0.4" Initial volume for the mixed sound. Values typically range from 0.0 to 1.0, but can go higher. ​ loop bool default: "true" Whether to loop the sound file when it reaches the end. ​ Control Frames ​ MixerUpdateSettingsFrame Frame Updates mixer settings at runtime Show properties ​ sound str Changes the current playing sound (must be a key in sound_files) ​ volume float Updates the mixing volume ​ loop bool Updates whether the sound should loop ​ MixerEnableFrame Frame Enables or disables the mixer Show properties ​ enable bool Whether mixing should be enabled ​ Usage Example Copy Ask AI # Initialize mixer with sound files mixer = SoundfileMixer( sound_files = { "office" : "office_ambience.wav" }, default_sound = "office" , volume = 2.0 , ) # Add to transport transport = DailyTransport( room_url, token, "Audio Bot" , DailyParams( audio_out_enabled = True , audio_out_mixer = mixer, ), ) # Control mixer at runtime await task.queue_frame(MixerUpdateSettingsFrame({ "volume" : 0.5 })) await task.queue_frame(MixerEnableFrame( False )) # Disable mixing await task.queue_frame(MixerEnableFrame( True )) # Enable mixing ​ Notes Supports any audio format that soundfile can read Automatically resamples audio files to match output sample rate Files must be mono (single channel) Thread-safe for pipeline processing Can dynamically switch between multiple sound files Volume can be adjusted in real-time Mixing can be enabled/disabled on demand SileroVADAnalyzer FrameFilter On this page Overview Installation Constructor Parameters Control Frames Usage Example Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/base-classes_media_af1c223c.txt b/base-classes_media_af1c223c.txt new file mode 100644 index 0000000000000000000000000000000000000000..017679ef8769a947b6606cbab0d5ccae26273331 --- /dev/null +++ b/base-classes_media_af1c223c.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/base-classes/media#join-our-community +Title: Overview - Pipecat +================================================== + +Overview - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Get Started Overview Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Get Started Overview Installation & Setup Quickstart Core Concepts Next Steps & Examples Pipecat is an open source Python framework that handles the complex orchestration of AI services, network transport, audio processing, and multimodal interactions. “Multimodal” means you can use any combination of audio, video, images, and/or text in your interactions. And “real-time” means that things are happening quickly enough that it feels conversational—a “back-and-forth” with a bot, not submitting a query and waiting for results. 
​ What You Can Build Voice Assistants Natural, real-time conversations with AI using speech recognition and synthesis Interactive Agents Personal coaches and meeting assistants that can understand context and provide guidance Multimodal Apps Applications that combine voice, video, images, and text for rich interactions Creative Tools Storytelling experiences and social companions that engage users Business Solutions Customer intake flows and support bots for automated business processes Complex Flows Structured conversations using Pipecat Flows for managing complex interactions ​ How It Works The flow of interactions in a Pipecat application is typically straightforward: The bot says something The user says something The bot says something The user says something This continues until the conversation naturally ends. While this flow seems simple, making it feel natural requires sophisticated real-time processing. ​ Real-time Processing Pipecat’s pipeline architecture handles both simple voice interactions and complex multimodal processing. Let’s look at how data flows through the system: Voice app Multimodal app 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio and Video Transmit and capture audio, video, and image inputs simultaneously 2 Process Streams Handle multiple input streams in parallel 3 Model Processing Send combined inputs to multimodal models (like GPT-4V) 4 Generate Outputs Create various outputs (text, images, audio, etc.) 5 Coordinate Presentation Synchronize and present multiple output types In both cases, Pipecat: Processes responses as they stream in Handles multiple input/output modalities concurrently Manages resource allocation and synchronization Coordinates parallel processing tasks This architecture creates fluid, natural interactions without noticeable delays, whether you’re building a simple voice assistant or a complex multimodal application. Pipecat’s pipeline architecture is particularly valuable for managing the complexity of real-time, multimodal interactions, ensuring smooth data flow and proper synchronization regardless of the input/output types involved. Pipecat handles all this complexity for you, letting you focus on building your application rather than managing the underlying infrastructure. ​ Next Steps Ready to build your first Pipecat application? Installation & Setup Prepare your environment and install required dependencies Quickstart Build and run your first Pipecat application Core Concepts Learn about pipelines, frames, and real-time processing Use Cases Explore example implementations and patterns ​ Join Our Community Discord Community Connect with other developers, share your projects, and get support from the Pipecat team. Installation & Setup On this page What You Can Build How It Works Real-time Processing Next Steps Join Our Community Assistant Responses are generated using AI and may contain mistakes. 
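The five-step voice flow above maps directly onto a Pipecat Pipeline. The sketch below is illustrative only: the Deepgram, OpenAI, and Cartesia services are arbitrary choices (any supported STT, LLM, and TTS services can be substituted), the transport is assumed to have been created already, and the module paths follow the pattern used in the service reference pages later in these docs.

# Illustrative sketch of the send audio -> transcribe -> LLM -> TTS -> play flow.
# Service choices, import paths, and the pre-built `transport` are assumptions.
import os

from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService

stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))  # 2. Transcribe Speech
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))      # 3. Process with LLM
tts = CartesiaTTSService(                                        # 4. Convert to Speech
    api_key=os.getenv("CARTESIA_API_KEY"),
    voice_id="your-voice-id",
)

pipeline = Pipeline([
    transport.input(),   # 1. Send Audio: streamed audio arriving from the user
    stt,
    llm,
    tts,
    transport.output(),  # 5. Play Audio: stream the response back to the user
])

task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))

A production bot would usually also insert context aggregators around the LLM, as shown in the Anthropic usage example further down, so the conversation history persists between turns.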
\ No newline at end of file diff --git a/base-classes_media_db6c6ed3.txt b/base-classes_media_db6c6ed3.txt new file mode 100644 index 0000000000000000000000000000000000000000..002649bb469432b81b8ee7b796c382c1a492c546 --- /dev/null +++ b/base-classes_media_db6c6ed3.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/base-classes/media#methods +Title: Overview - Pipecat +================================================== + +Overview - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Get Started Overview Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Get Started Overview Installation & Setup Quickstart Core Concepts Next Steps & Examples Pipecat is an open source Python framework that handles the complex orchestration of AI services, network transport, audio processing, and multimodal interactions. “Multimodal” means you can use any combination of audio, video, images, and/or text in your interactions. And “real-time” means that things are happening quickly enough that it feels conversational—a “back-and-forth” with a bot, not submitting a query and waiting for results. ​ What You Can Build Voice Assistants Natural, real-time conversations with AI using speech recognition and synthesis Interactive Agents Personal coaches and meeting assistants that can understand context and provide guidance Multimodal Apps Applications that combine voice, video, images, and text for rich interactions Creative Tools Storytelling experiences and social companions that engage users Business Solutions Customer intake flows and support bots for automated business processes Complex Flows Structured conversations using Pipecat Flows for managing complex interactions ​ How It Works The flow of interactions in a Pipecat application is typically straightforward: The bot says something The user says something The bot says something The user says something This continues until the conversation naturally ends. While this flow seems simple, making it feel natural requires sophisticated real-time processing. ​ Real-time Processing Pipecat’s pipeline architecture handles both simple voice interactions and complex multimodal processing. Let’s look at how data flows through the system: Voice app Multimodal app 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio and Video Transmit and capture audio, video, and image inputs simultaneously 2 Process Streams Handle multiple input streams in parallel 3 Model Processing Send combined inputs to multimodal models (like GPT-4V) 4 Generate Outputs Create various outputs (text, images, audio, etc.) 
5 Coordinate Presentation Synchronize and present multiple output types In both cases, Pipecat: Processes responses as they stream in Handles multiple input/output modalities concurrently Manages resource allocation and synchronization Coordinates parallel processing tasks This architecture creates fluid, natural interactions without noticeable delays, whether you’re building a simple voice assistant or a complex multimodal application. Pipecat’s pipeline architecture is particularly valuable for managing the complexity of real-time, multimodal interactions, ensuring smooth data flow and proper synchronization regardless of the input/output types involved. Pipecat handles all this complexity for you, letting you focus on building your application rather than managing the underlying infrastructure. ​ Next Steps Ready to build your first Pipecat application? Installation & Setup Prepare your environment and install required dependencies Quickstart Build and run your first Pipecat application Core Concepts Learn about pipelines, frames, and real-time processing Use Cases Explore example implementations and patterns ​ Join Our Community Discord Community Connect with other developers, share your projects, and get support from the Pipecat team. Installation & Setup On this page What You Can Build How It Works Real-time Processing Next Steps Join Our Community Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/base-classes_speech_afdfa0be.txt b/base-classes_speech_afdfa0be.txt new file mode 100644 index 0000000000000000000000000000000000000000..6ebd885e7fa87f9d1c0680bd81927c24cc6006b3 --- /dev/null +++ b/base-classes_speech_afdfa0be.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/base-classes/speech#real-time-processing +Title: Overview - Pipecat +================================================== + +Overview - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Get Started Overview Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Get Started Overview Installation & Setup Quickstart Core Concepts Next Steps & Examples Pipecat is an open source Python framework that handles the complex orchestration of AI services, network transport, audio processing, and multimodal interactions. “Multimodal” means you can use any combination of audio, video, images, and/or text in your interactions. And “real-time” means that things are happening quickly enough that it feels conversational—a “back-and-forth” with a bot, not submitting a query and waiting for results. ​ What You Can Build Voice Assistants Natural, real-time conversations with AI using speech recognition and synthesis Interactive Agents Personal coaches and meeting assistants that can understand context and provide guidance Multimodal Apps Applications that combine voice, video, images, and text for rich interactions Creative Tools Storytelling experiences and social companions that engage users Business Solutions Customer intake flows and support bots for automated business processes Complex Flows Structured conversations using Pipecat Flows for managing complex interactions ​ How It Works The flow of interactions in a Pipecat application is typically straightforward: The bot says something The user says something The bot says something The user says something This continues until the conversation naturally ends. While this flow seems simple, making it feel natural requires sophisticated real-time processing. 
​ Real-time Processing Pipecat’s pipeline architecture handles both simple voice interactions and complex multimodal processing. Let’s look at how data flows through the system: Voice app Multimodal app 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio and Video Transmit and capture audio, video, and image inputs simultaneously 2 Process Streams Handle multiple input streams in parallel 3 Model Processing Send combined inputs to multimodal models (like GPT-4V) 4 Generate Outputs Create various outputs (text, images, audio, etc.) 5 Coordinate Presentation Synchronize and present multiple output types In both cases, Pipecat: Processes responses as they stream in Handles multiple input/output modalities concurrently Manages resource allocation and synchronization Coordinates parallel processing tasks This architecture creates fluid, natural interactions without noticeable delays, whether you’re building a simple voice assistant or a complex multimodal application. Pipecat’s pipeline architecture is particularly valuable for managing the complexity of real-time, multimodal interactions, ensuring smooth data flow and proper synchronization regardless of the input/output types involved. Pipecat handles all this complexity for you, letting you focus on building your application rather than managing the underlying infrastructure. ​ Next Steps Ready to build your first Pipecat application? Installation & Setup Prepare your environment and install required dependencies Quickstart Build and run your first Pipecat application Core Concepts Learn about pipelines, frames, and real-time processing Use Cases Explore example implementations and patterns ​ Join Our Community Discord Community Connect with other developers, share your projects, and get support from the Pipecat team. Installation & Setup On this page What You Can Build How It Works Real-time Processing Next Steps Join Our Community Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/base-classes_text_1252cc3e.txt b/base-classes_text_1252cc3e.txt new file mode 100644 index 0000000000000000000000000000000000000000..8cfa341a133dba67a11623a071a4f97e43a60227 --- /dev/null +++ b/base-classes_text_1252cc3e.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/base-classes/text#real-time-processing +Title: Overview - Pipecat +================================================== + +Overview - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Get Started Overview Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Get Started Overview Installation & Setup Quickstart Core Concepts Next Steps & Examples Pipecat is an open source Python framework that handles the complex orchestration of AI services, network transport, audio processing, and multimodal interactions. 
“Multimodal” means you can use any combination of audio, video, images, and/or text in your interactions. And “real-time” means that things are happening quickly enough that it feels conversational—a “back-and-forth” with a bot, not submitting a query and waiting for results. ​ What You Can Build Voice Assistants Natural, real-time conversations with AI using speech recognition and synthesis Interactive Agents Personal coaches and meeting assistants that can understand context and provide guidance Multimodal Apps Applications that combine voice, video, images, and text for rich interactions Creative Tools Storytelling experiences and social companions that engage users Business Solutions Customer intake flows and support bots for automated business processes Complex Flows Structured conversations using Pipecat Flows for managing complex interactions ​ How It Works The flow of interactions in a Pipecat application is typically straightforward: The bot says something The user says something The bot says something The user says something This continues until the conversation naturally ends. While this flow seems simple, making it feel natural requires sophisticated real-time processing. ​ Real-time Processing Pipecat’s pipeline architecture handles both simple voice interactions and complex multimodal processing. Let’s look at how data flows through the system: Voice app Multimodal app 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio and Video Transmit and capture audio, video, and image inputs simultaneously 2 Process Streams Handle multiple input streams in parallel 3 Model Processing Send combined inputs to multimodal models (like GPT-4V) 4 Generate Outputs Create various outputs (text, images, audio, etc.) 5 Coordinate Presentation Synchronize and present multiple output types In both cases, Pipecat: Processes responses as they stream in Handles multiple input/output modalities concurrently Manages resource allocation and synchronization Coordinates parallel processing tasks This architecture creates fluid, natural interactions without noticeable delays, whether you’re building a simple voice assistant or a complex multimodal application. Pipecat’s pipeline architecture is particularly valuable for managing the complexity of real-time, multimodal interactions, ensuring smooth data flow and proper synchronization regardless of the input/output types involved. Pipecat handles all this complexity for you, letting you focus on building your application rather than managing the underlying infrastructure. ​ Next Steps Ready to build your first Pipecat application? 
Installation & Setup Prepare your environment and install required dependencies Quickstart Build and run your first Pipecat application Core Concepts Learn about pipelines, frames, and real-time processing Use Cases Explore example implementations and patterns ​ Join Our Community Discord Community Connect with other developers, share your projects, and get support from the Pipecat team. Installation & Setup On this page What You Can Build How It Works Real-time Processing Next Steps Join Our Community Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/base-classes_text_96b7f57c.txt b/base-classes_text_96b7f57c.txt new file mode 100644 index 0000000000000000000000000000000000000000..a5ab6a250d2e5c5f8fd0ce71fe5bf254c275fc2a --- /dev/null +++ b/base-classes_text_96b7f57c.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/base-classes/text#join-our-community +Title: Overview - Pipecat +================================================== + +Overview - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Get Started Overview Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Get Started Overview Installation & Setup Quickstart Core Concepts Next Steps & Examples Pipecat is an open source Python framework that handles the complex orchestration of AI services, network transport, audio processing, and multimodal interactions. “Multimodal” means you can use any combination of audio, video, images, and/or text in your interactions. And “real-time” means that things are happening quickly enough that it feels conversational—a “back-and-forth” with a bot, not submitting a query and waiting for results. ​ What You Can Build Voice Assistants Natural, real-time conversations with AI using speech recognition and synthesis Interactive Agents Personal coaches and meeting assistants that can understand context and provide guidance Multimodal Apps Applications that combine voice, video, images, and text for rich interactions Creative Tools Storytelling experiences and social companions that engage users Business Solutions Customer intake flows and support bots for automated business processes Complex Flows Structured conversations using Pipecat Flows for managing complex interactions ​ How It Works The flow of interactions in a Pipecat application is typically straightforward: The bot says something The user says something The bot says something The user says something This continues until the conversation naturally ends. While this flow seems simple, making it feel natural requires sophisticated real-time processing. ​ Real-time Processing Pipecat’s pipeline architecture handles both simple voice interactions and complex multimodal processing. 
Let’s look at how data flows through the system: Voice app Multimodal app 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio and Video Transmit and capture audio, video, and image inputs simultaneously 2 Process Streams Handle multiple input streams in parallel 3 Model Processing Send combined inputs to multimodal models (like GPT-4V) 4 Generate Outputs Create various outputs (text, images, audio, etc.) 5 Coordinate Presentation Synchronize and present multiple output types In both cases, Pipecat: Processes responses as they stream in Handles multiple input/output modalities concurrently Manages resource allocation and synchronization Coordinates parallel processing tasks This architecture creates fluid, natural interactions without noticeable delays, whether you’re building a simple voice assistant or a complex multimodal application. Pipecat’s pipeline architecture is particularly valuable for managing the complexity of real-time, multimodal interactions, ensuring smooth data flow and proper synchronization regardless of the input/output types involved. Pipecat handles all this complexity for you, letting you focus on building your application rather than managing the underlying infrastructure. ​ Next Steps Ready to build your first Pipecat application? Installation & Setup Prepare your environment and install required dependencies Quickstart Build and run your first Pipecat application Core Concepts Learn about pipelines, frames, and real-time processing Use Cases Explore example implementations and patterns ​ Join Our Community Discord Community Connect with other developers, share your projects, and get support from the Pipecat team. Installation & Setup On this page What You Can Build How It Works Real-time Processing Next Steps Join Our Community Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/c_introduction_82958e9e.txt b/c_introduction_82958e9e.txt new file mode 100644 index 0000000000000000000000000000000000000000..2b9fe07ff4ae88500f8553f84346ac69110fea03 --- /dev/null +++ b/c_introduction_82958e9e.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/client/c++/introduction#libcurl +Title: SDK Introduction - Pipecat +================================================== + +SDK Introduction - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation C++ SDK SDK Introduction Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Client SDKs The RTVI Standard RTVIClient Migration Guide Javascript SDK SDK Introduction API Reference Transport packages React SDK SDK Introduction API Reference React Native SDK SDK Introduction API Reference iOS SDK SDK Introduction API Reference Transport packages Android SDK SDK Introduction API Reference Transport packages C++ SDK SDK Introduction Daily WebRTC Transport The Pipecat C++ SDK provides a native implementation for building voice and multimodal AI applications. 
It supports: Linux ( x86_64 and aarch64 ) macOS ( aarch64 ) Windows ( x86_64 ) ​ Dependencies ​ libcurl The SDK uses libcurl for HTTP requests. Linux macOS Windows Copy Ask AI sudo apt-get install libcurl4-openssl-dev On macOS libcurl is already included so there is nothing to install. On Windows we use vcpkg to install dependencies. You need to set it up following one of the tutorials . The libcurl dependency will be automatically downloaded when building. ​ Installation Build the SDK using CMake: Linux/macOS Windows Copy Ask AI cmake . -G Ninja -Bbuild -DCMAKE_BUILD_TYPE=Release ninja -C build Copy Ask AI # Initialize Visual Studio environment "C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Auxiliary\Build\vcvarsall.bat" amd64 # Configure and build cmake . -Bbuild --preset vcpkg cmake --build build --config Release ​ Cross-compilation For Linux aarch64: Copy Ask AI cmake . -G Ninja -Bbuild -DCMAKE_TOOLCHAIN_FILE=aarch64-linux-toolchain.cmake -DCMAKE_BUILD_TYPE=Release ninja -C build ​ Documentation API Reference Complete SDK API documentation Daily Transport WebRTC implementation using Daily Small WebRTC Transport Daily WebRTC Transport On this page Dependencies libcurl Installation Cross-compilation Documentation Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/client_introduction_88cc8f09.txt b/client_introduction_88cc8f09.txt new file mode 100644 index 0000000000000000000000000000000000000000..a8ce72adfffcc14224dc110a346eb5c0f45f4e85 --- /dev/null +++ b/client_introduction_88cc8f09.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/client/introduction#rtvimessage +Title: Client SDKs - Pipecat +================================================== + +Client SDKs - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Client SDKs Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Client SDKs The RTVI Standard RTVIClient Migration Guide Javascript SDK SDK Introduction API Reference Transport packages React SDK SDK Introduction API Reference React Native SDK SDK Introduction API Reference iOS SDK SDK Introduction API Reference Transport packages Android SDK SDK Introduction API Reference Transport packages C++ SDK SDK Introduction Daily WebRTC Transport The Client SDKs are currently in transition to a new, simpler API design. The js and react libraries have already been deployed with these changes. Their corresponding documentation along with this top-level documentation has been updated to reflect the latest changes. For transitioning to the new API, please refer to the migration guide . Note that React Native, iOS, and Android SDKs are still in the process of being updated and their documentation will be updated once the new versions are released. If you have any questions or need assistance, please reach out to us on Discord . Pipecat provides client SDKs for multiple platforms, all implementing the RTVI (Real-Time Voice and Video Inference) standard. These SDKs make it easy to build real-time AI applications that can handle voice, video, and text interactions.
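The connection examples below fetch transport credentials from a REST endpoint before connecting. As a rough idea of the server-side counterpart, here is a hypothetical /connect handler; FastAPI, the environment variable names, and the response shape are illustrative assumptions and not part of the SDK documentation (the exact shape depends on the transport you use).

# Hypothetical /connect endpoint handing Daily credentials to a client.
# Framework choice, variable names, and response keys are assumptions.
import os

from fastapi import FastAPI

app = FastAPI()

@app.post("/connect")
async def connect():
    # A real deployment would create or look up a Daily room, mint a
    # short-lived token, and start the bot for that room before returning.
    return {
        "room_url": os.getenv("DAILY_SAMPLE_ROOM_URL"),
        "token": os.getenv("DAILY_SAMPLE_ROOM_TOKEN"),
    }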
Javascript Pipecat JS SDK React Pipecat React SDK React Native Pipecat React Native SDK Swift Pipecat iOS SDK Kotlin Pipecat Android SDK C++ Pipecat C++ SDK ​ Core Functionality All Pipecat client SDKs provide: Media Management Handle device inputs and media streams for audio and video Bot Integration Configure and communicate with your Pipecat bot Session Management Manage connection state and error handling ​ Core Types ​ PipecatClient The main class for interacting with Pipecat bots. It is the primary type you will interact with. ​ Transport The PipecatClient wraps a Transport, which defines and provides the underlying connection mechanism (e.g., WebSocket, WebRTC). Your Pipecat pipeline will contain a corresponding transport. ​ RTVIMessage Represents a message sent to or received from a Pipecat bot. ​ Simple Usage Examples Connecting to a Bot Custom Messaging Establish ongoing connections via WebSocket or WebRTC for: Live voice conversations Real-time video processing Continuous interactions javascript react Copy Ask AI // Example: Establishing a real-time connection import { RTVIEvent , RTVIMessage , PipecatClient } from "@pipecat-ai/client-js" ; import { DailyTransport } from "@pipecat-ai/daily-transport" ; const pcClient = new PipecatClient ({ transport: new DailyTransport (), enableMic: true , enableCam: false , enableScreenShare: false , callbacks: { onBotConnected : () => { console . log ( "[CALLBACK] Bot connected" ); }, onBotDisconnected : () => { console . log ( "[CALLBACK] Bot disconnected" ); }, onBotReady : () => { console . log ( "[CALLBACK] Bot ready to chat!" ); }, }, }); try { // Below, we use a REST endpoint to fetch connection credentials for our // Daily Transport. Alternatively, you could provide those credentials // directly to `connect()`. await pcClient . connect ({ endpoint: "https://your-connect-end-point-here/connect" , }); } catch ( e ) { console . error ( e . message ); } // Events (alternative approach to constructor-provided callbacks) pcClient . on ( RTVIEvent . Connected , () => { console . log ( "[EVENT] User connected" ); }); pcClient . on ( RTVIEvent . Disconnected , () => { console . log ( "[EVENT] User disconnected" ); }); Establish ongoing connections via WebSocket or WebRTC for: Live voice conversations Real-time video processing Continuous interactions javascript react Copy Ask AI // Example: Establishing a real-time connection import { RTVIEvent , RTVIMessage , PipecatClient } from "@pipecat-ai/client-js" ; import { DailyTransport } from "@pipecat-ai/daily-transport" ; const pcClient = new PipecatClient ({ transport: new DailyTransport (), enableMic: true , enableCam: false , enableScreenShare: false , callbacks: { onBotConnected : () => { console . log ( "[CALLBACK] Bot connected" ); }, onBotDisconnected : () => { console . log ( "[CALLBACK] Bot disconnected" ); }, onBotReady : () => { console . log ( "[CALLBACK] Bot ready to chat!" ); }, }, }); try { // Below, we use a REST endpoint to fetch connection credentials for our // Daily Transport. Alternatively, you could provide those credentials // directly to `connect()`. await pcClient . connect ({ endpoint: "https://your-connect-end-point-here/connect" , }); } catch ( e ) { console . error ( e . message ); } // Events (alternative approach to constructor-provided callbacks) pcClient . on ( RTVIEvent . Connected , () => { console . log ( "[EVENT] User connected" ); }); pcClient . on ( RTVIEvent . Disconnected , () => { console . 
log ( "[EVENT] User disconnected" ); }); Send custom messages and handle responses from your bot. This is useful for: Running server-side functionality Triggering specific bot actions Querying the server Responding to server requests javascript react Copy Ask AI import { PipecatClient } from "@pipecat-ai/client-js" ; const pcClient = new PipecatClient ({ transport: new DailyTransport (), callbacks: { onBotConnected : () => { pcClient . sendClientRequest ( 'get-language' ) . then (( response ) => { console . log ( "[CALLBACK] Bot using language:" , response ); if ( response !== preferredLanguage ) { pcClient . sendClientMessage ( 'set-language' , { language: preferredLanguage }); } }) . catch (( error ) => { console . error ( "[CALLBACK] Error getting language:" , error ); }); }, onServerMessage : ( message ) => { console . log ( "[CALLBACK] Received message from server:" , message ); }, }, }); await pcClient . connect ({ url: "https://your-daily-room-url" , token: "your-daily-token" }); ​ About RTVI Pipecat’s client SDKs implement the RTVI (Real-Time Voice and Video Inference) standard, an open specification for real-time AI inference. This means: Your code can work with any RTVI-compatible inference service You get battle-tested tooling for real-time multimedia handling You can easily set up development and testing environments ​ Next Steps Get started by trying out examples: Simple Chatbot Example Complete client-server example with both bot backend (Python) and frontend implementation (JS, React, React Native, iOS, and Android). More Examples Explore our full collection of example applications and implementations across different platforms and use cases. The RTVI Standard On this page Core Functionality Core Types PipecatClient Transport RTVIMessage Simple Usage Examples About RTVI Next Steps Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/client_rtvi-standard_37662bce.txt b/client_rtvi-standard_37662bce.txt new file mode 100644 index 0000000000000000000000000000000000000000..e02121bf3a46550dc80f9bd838d676aabb70d85d --- /dev/null +++ b/client_rtvi-standard_37662bce.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/client/rtvi-standard#error-response-%F0%9F%A4%96 +Title: The RTVI Standard - Pipecat +================================================== + +The RTVI Standard - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation The RTVI Standard Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Client SDKs The RTVI Standard RTVIClient Migration Guide Javascript SDK SDK Introduction API Reference Transport packages React SDK SDK Introduction API Reference React Native SDK SDK Introduction API Reference iOS SDK SDK Introduction API Reference Transport packages Android SDK SDK Introduction API Reference Transport packages C++ SDK SDK Introduction Daily WebRTC Transport The RTVI (Real-Time Voice and Video Inference) standard defines a set of message types and structures sent between clients and servers. It is designed to facilitate real-time interactions between clients and AI applications that require voice, video, and text communication. It provides a consistent framework for building applications that can communicate with AI models and the backends running those models in real-time. This page documents version 1.0 of the RTVI standard, released in June 2025. 
​ Key Features Connection Management RTVI provides a flexible connection model that allows clients to connect to AI services and coordinate state. Transcriptions The standard includes built-in support for real-time transcription of audio streams. Client-Server Messaging The standard defines a messaging protocol for sending and receiving messages between clients and servers, allowing for efficient communication of requests and responses. Advanced LLM Interactions The standard supports advanced interactions with large language models (LLMs), including context management, function call handling, and search results. Service-Specific Insights RTVI supports events to provide insight into the input/output and state for typical services that exist in speech-to-speech workflows. Metrics and Monitoring RTVI provides mechanisms for collecting metrics and monitoring the performance of server-side services. ​ Terms Client : The front-end application or user interface that interacts with the RTVI server. Server : The back-end service that runs the AI framework and processes requests from the client. User : The end user interacting with the client application. Bot : The AI interacting with the user, technically an amalgamation of a large language model (LLM) and a text-to-speech (TTS) service. ​ RTVI Message Format The messages defined as part of the RTVI protocol adhere to the following format: Copy Ask AI { "id" : string , "label" : "rtvi-ai" , "type" : string , "data" : unknown } ​ id string A unique identifier for the message, used to correlate requests and responses. ​ label string default: "rtvi-ai" required A label that identifies this message as an RTVI message. This field is required and should always be set to 'rtvi-ai' . ​ type string required The type of message being sent. This field is required and should be set to one of the predefined RTVI message types listed below. ​ data unknown The payload of the message, which can be any data structure relevant to the message type. ​ RTVI Message Types Following the above format, this section describes the various message types defined by the RTVI standard. Each message type has a specific purpose and structure, allowing for clear communication between clients and servers. Each message type below includes either a 🤖 or 🏄 emoji to denote whether the message is sent from the bot (🤖) or client (🏄). ​ Connection Management ​ client-ready 🏄 Indicates that the client is ready to receive messages and interact with the server. Typically sent after the transport media channels have connected. type : 'client-ready' data : version : string The version of the RTVI standard being used. This is useful for ensuring compatibility between client and server implementations. about : AboutClient Object An object containing information about the client, such as its rtvi-version, client library, and any other relevant metadata. The AboutClient object follows this structure: Show AboutClient ​ library string required ​ library_version string ​ platform string ​ platform_version string ​ platform_details any Any platform-specific details that may be relevant to the server. This could include information about the browser, operating system, or any other environment-specific data needed by the server. This field is optional and open-ended, so please be mindful of the data you include here and any security concerns that may arise from exposing sensitive or personally identifiable information.
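Putting the envelope and the client-ready payload together, a complete message might look like the following; the values are purely illustrative.

# Illustrative client-ready message using the envelope described above.
client_ready_message = {
    "id": "msg-0001",        # correlates any later response with this message
    "label": "rtvi-ai",      # always "rtvi-ai"
    "type": "client-ready",
    "data": {
        "version": "1.0",
        "about": {
            "library": "@pipecat-ai/client-js",  # required field
            "library_version": "1.0.0",
            "platform": "web",
            "platform_version": "chrome-126",
        },
    },
}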
​ bot-ready 🤖 Indicates that the bot is ready to receive messages and interact with the client. Typically sent after the transport media channels have connected. type : 'bot-ready' data : version : string The version of the RTVI standard being used. This is useful for ensuring compatibility between client and server implementations. about : any (Optional) An object containing information about the server or bot. Its structure and value are both undefined by default. This provides flexibility to include any relevant metadata your client may need to know about the server at connection time, without any built-in security concerns. Please be mindful of the data you include here and any security concerns that may arise from exposing sensitive information. ​ disconnect-bot 🏄 Indicates that the client wishes to disconnect from the bot. Typically used when the client is shutting down or no longer needs to interact with the bot. Note: Disconnects should happen automatically when either the client or bot disconnects from the transport, so this message is intended for the case where a client may want to remain connected to the transport but no longer wishes to interact with the bot. type : 'disconnect-bot' data : undefined ​ error 🤖 Indicates an error occurred during bot initialization or runtime. type : 'error' data : message : string Description of the error. fatal : boolean Indicates if the error is fatal to the session. ​ Transcription ​ user-started-speaking 🤖 Emitted when the user begins speaking. type : 'user-started-speaking' data : None ​ user-stopped-speaking 🤖 Emitted when the user stops speaking. type : 'user-stopped-speaking' data : None ​ bot-started-speaking 🤖 Emitted when the bot begins speaking. type : 'bot-started-speaking' data : None ​ bot-stopped-speaking 🤖 Emitted when the bot stops speaking. type : 'bot-stopped-speaking' data : None ​ user-transcription 🤖 Real-time transcription of user speech, including both partial and final results. type : 'user-transcription' data : text : string The transcribed text of the user. final : boolean Indicates if this is a final transcription or a partial result. timestamp : string The timestamp when the transcription was generated. user_id : string Identifier for the user who spoke. ​ bot-transcription 🤖 Transcription of the bot’s speech. Note: This protocol currently does not match the user transcription format to support real-time timestamping for bot transcriptions. Rather, the event is typically sent for each sentence of the bot’s response. This difference is currently due to limitations in TTS services, which mostly do not support (or do not support well) accurate timing information. If/when this changes, this protocol may be updated to include the necessary timing information. For now, if you want to attempt real-time transcription to match your bot’s speaking, you can try using the bot-tts-text message type. type : 'bot-transcription' data : text : string The transcribed text from the bot, typically aggregated at a per-sentence level. ​ Client-Server Messaging ​ server-message 🤖 An arbitrary message sent from the server to the client. This can be used for custom interactions or commands. This message may be coupled with the client-message message type to handle responses from the client. type : 'server-message' data : any The data can be any JSON-serializable object, formatted according to your own specifications. ​ client-message 🏄 An arbitrary message sent from the client to the server. This can be used for custom interactions or commands.
This message may be coupled with the server-response message type to handle responses from the server. type : 'client-message' data : t : string d : unknown (optional) The data payload should contain a t field indicating the type of message and an optional d field containing any custom, corresponding data needed for the message. ​ server-response 🤖 A message sent from the server to the client in response to a client-message . IMPORTANT : The id should match the id of the original client-message to correlate the response with the request. type : 'server-response' data : t : string d : unknown (optional) The data payload should contain a t field indicating the type of message and an optional d field containing any custom, corresponding data needed for the message. ​ error-response 🤖 Error response to a specific client message. IMPORTANT : The id should match the id of the original client-message to correlate the response with the request. type : 'error-response' data : error : string ​ Advanced LLM Interactions ​ append-to-context 🏄 A message sent from the client to the server to append data to the context of the current LLM conversation. This is useful for providing text-based content for the user or augmenting the context for the assistant. type : 'append-to-context' data : role : "user" | "assistant" The role the context should be appended to. Currently only supports "user" and "assistant" . content : unknown The content to append to the context. This can be any data structure the LLM understands. run_immediately : boolean (optional) Indicates whether the context should be run immediately after appending. Defaults to false . If set to false , the context will be appended but not executed until the next LLM run. ​ llm-function-call 🤖 A function call request from the LLM, sent from the bot to the client. Note that for most cases, an LLM function call will be handled completely server-side. However, in the event that the call requires input from the client or the client needs to be aware of the function call, this message/response schema is required. type : 'llm-function-call' data : function_name : string Name of the function to be called. tool_call_id : string Unique identifier for this function call. args : Record Arguments to be passed to the function. ​ llm-function-call-result 🏄 The result of the function call requested by the LLM, returned from the client. type : 'llm-function-call-result' data : function_name : string Name of the called function. tool_call_id : string Identifier matching the original function call. args : Record Arguments that were passed to the function. result : Record | string The result returned by the function. ​ bot-llm-search-response 🤖 Search results from the LLM’s knowledge base. Currently, Google Gemini is the only LLM that supports built-in search. However, we expect other LLMs to follow suit, which is why this message type is defined as part of the RTVI standard. As more LLMs add support for this feature, the format of this message type may evolve to accommodate discrepancies. type : 'bot-llm-search-response' data : search_result : string (optional) Raw search result text. rendered_content : string (optional) Formatted version of the search results. origins : Array Source information and confidence scores for search results.
The Origin Object follows this structure: Copy Ask AI { "site_uri" : string (optional) , "site_title" : string (optional) , "results" : Array< { "text" : string , "confidence" : number [] } > } Example: Copy Ask AI "id" : undefined "label" : "rtvi-ai" "type" : "bot-llm-search-response" "data" : { "origins" : [ { "results" : [ { "confidence" : [ 0.9881149530410768 ], "text" : "* Juneteenth: A Freedom Celebration is scheduled for June 18th from 12:00 pm to 2:00 pm." }, { "confidence" : [ 0.9692034721374512 ], "text" : "* A Juneteenth celebration at Fort Negley Park will take place on June 19th from 5:00 pm to 9:30 pm." } ], "site_title" : "vanderbilt.edu" , "site_uri" : "https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHwif83VK9KAzrbMSGSBsKwL8vWfSfC9pgEWYKmStHyqiRoV1oe8j1S0nbwRg_iWgqAr9wUkiegu3ATC8Ll-cuE-vpzwElRHiJ2KgRYcqnOQMoOeokVpWqi" }, { "results" : [ { "confidence" : [ 0.6554043292999268 ], "text" : "In addition to these events, Vanderbilt University is a large research institution with ongoing activities across many fields." } ], "site_title" : "wikipedia.org" , "site_uri" : "https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQESbF-ijx78QbaglrhflHCUWdPTD4M6tYOQigW5hgsHNctRlAHu9ktfPmJx7DfoP5QicE0y-OQY1cRl9w4Id0btiFgLYSKIm2-SPtOHXeNrAlgA7mBnclaGrD7rgnLIbrjl8DgUEJrrvT0CKzuo" }], "rendered_content" : "…" }
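To make the request/response correlation concrete, here is an illustrative exchange sketched as plain Python dicts: a client-message plus the matching server-response and error-response, all reusing the same id as the standard requires. The payload contents are invented for the example.

# Illustrative client-message / response pair; the shared "id" ties them together.
request = {
    "id": "req-42",
    "label": "rtvi-ai",
    "type": "client-message",
    "data": {"t": "get-language", "d": None},
}

response = {
    "id": "req-42",          # must match the originating client-message
    "label": "rtvi-ai",
    "type": "server-response",
    "data": {"t": "get-language", "d": {"language": "en-US"}},
}

failure = {
    "id": "req-42",          # error responses are correlated the same way
    "label": "rtvi-ai",
    "type": "error-response",
    "data": {"error": "language lookup failed"},
}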

\ No newline at end of file diff --git a/ios_introduction_70040911.txt b/ios_introduction_70040911.txt new file mode 100644 index 0000000000000000000000000000000000000000..c75ce75f0fd99393c8c3c0c72d074a121a9c8075 --- /dev/null +++ b/ios_introduction_70040911.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/client/ios/introduction +Title: SDK Introduction - Pipecat +================================================== + +SDK Introduction - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation iOS SDK SDK Introduction Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Client SDKs The RTVI Standard RTVIClient Migration Guide Javascript SDK SDK Introduction API Reference Transport packages React SDK SDK Introduction API Reference React Native SDK SDK Introduction API Reference iOS SDK SDK Introduction API Reference Transport packages Android SDK SDK Introduction API Reference Transport packages C++ SDK SDK Introduction Daily WebRTC Transport The Pipecat iOS SDK provides a Swift implementation for building voice and multimodal AI applications on iOS. It handles: Real-time audio streaming Bot communication and state management Media device handling Configuration management Event handling ​ Installation Add the SDK to your project using Swift Package Manager: Copy Ask AI // Core SDK . package ( url : "https://github.com/pipecat-ai/pipecat-client-ios.git" , from : "0.3.0" ), // Daily transport implementation . package ( url : "https://github.com/pipecat-ai/pipecat-client-ios-daily.git" , from : "0.3.0" ), Then add the dependencies to your target: Copy Ask AI . target ( name : "YourApp" , dependencies : [ . product ( name : "PipecatClientIOS" , package : "pipecat-client-ios" ), . product ( name : "PipecatClientIOSDaily" , package : "pipecat-client-ios-daily" ) ]), ​ Example Here’s a simple example using Daily as the transport layer: Copy Ask AI import PipecatClientIOS import PipecatClientIOSDaily let clientConfig = [ ServiceConfig ( service : "llm" , options : [ Option ( name : "model" , value : . string ( "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo" )), Option ( name : "messages" , value : . array ([ . object ([ "role" : . string ( "system" ), "content" : . string ( "You are a helpful assistant." ) ]) ])) ] ), ServiceConfig ( service : "tts" , options : [ Option ( name : "voice" , value : . string ( "79a125e8-cd45-4c13-8a67-188112f4dd22" )) ] ) ] let options = RTVIClientOptions. init ( enableMic : true , params : RTVIClientParams ( baseUrl : $PIPECAT_API_URL, config : clientConfig ) ) let client = RTVIClient. init ( transport : DailyTransport. init ( options : options), options : options ) try await client. start () ​ Documentation API Reference Complete SDK API documentation Source Pipecat Client iOS Demo Simple Chatbot Demo Daily Transport WebRTC implementation using Daily API Reference API Reference On this page Installation Example Documentation Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/llm_anthropic_11639834.txt b/llm_anthropic_11639834.txt new file mode 100644 index 0000000000000000000000000000000000000000..990351c9c49e359f3f71d7db8e5b31f107ece3c9 --- /dev/null +++ b/llm_anthropic_11639834.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/services/llm/anthropic#metrics +Title: Anthropic - Pipecat +================================================== + +Anthropic - Pipecat Pipecat home page Search... ⌘ K Ask AI Search...
Navigation LLM Anthropic Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Anthropic AWS Bedrock Azure Cerebras DeepSeek Fireworks AI Google Gemini Google Vertex AI Grok Groq NVIDIA NIM Ollama OpenAI OpenPipe OpenRouter Perplexity Qwen SambaNova Together AI Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview AnthropicLLMService provides integration with Anthropic’s Claude models, supporting streaming responses, function calling, and prompt caching with specialized context handling for Anthropic’s message format. API Reference Complete API documentation and method details Anthropic Docs Official Anthropic API documentation and features Example Code Working example with function calling ​ Installation To use Anthropic services, install the required dependency: Copy Ask AI pip install "pipecat-ai[anthropic]" You’ll also need to set up your Anthropic API key as an environment variable: ANTHROPIC_API_KEY . Get your API key from Anthropic Console . ​ Frames ​ Input OpenAILLMContextFrame - Conversation context and history LLMMessagesFrame - Direct message list VisionImageRawFrame - Images for vision processing LLMUpdateSettingsFrame - Runtime parameter updates LLMEnablePromptCachingFrame - Toggle prompt caching ​ Output LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries LLMTextFrame - Streamed completion chunks FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle ErrorFrame - API or processing errors ​ Function Calling Function Calling Guide Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications. ​ Context Management Context Management Guide Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences. ​ Usage Example Copy Ask AI import os from pipecat.services.anthropic.llm import AnthropicLLMService from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext from pipecat.adapters.schemas.function_schema import FunctionSchema from pipecat.adapters.schemas.tools_schema import ToolsSchema # Configure the service llm = AnthropicLLMService( api_key = os.getenv( "ANTHROPIC_API_KEY" ), model = "claude-sonnet-4-20250514" , params = AnthropicLLMService.InputParams( temperature = 0.7 , enable_prompt_caching_beta = True ) ) # Define function for tool calling weather_function = FunctionSchema( name = "get_weather" , description = "Get current weather information" , properties = { "location" : { "type" : "string" , "description" : "City and state, e.g. San Francisco, CA" } }, required = [ "location" ] ) tools = ToolsSchema( standard_tools = [weather_function]) # Create context with system message context = OpenAILLMContext( messages = [{ "role" : "user" , "content" : "What's the weather like?" 
Frames

Input

OpenAILLMContextFrame - Conversation context and history
LLMMessagesFrame - Direct message list
VisionImageRawFrame - Images for vision processing
LLMUpdateSettingsFrame - Runtime parameter updates
LLMEnablePromptCachingFrame - Toggle prompt caching

Output

LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
LLMTextFrame - Streamed completion chunks
FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
ErrorFrame - API or processing errors

Function Calling

Function Calling Guide - Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

Context Management Guide - Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

    import os

    from pipecat.services.anthropic.llm import AnthropicLLMService
    from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
    from pipecat.adapters.schemas.function_schema import FunctionSchema
    from pipecat.adapters.schemas.tools_schema import ToolsSchema

    # Configure the service
    llm = AnthropicLLMService(
        api_key=os.getenv("ANTHROPIC_API_KEY"),
        model="claude-sonnet-4-20250514",
        params=AnthropicLLMService.InputParams(
            temperature=0.7,
            enable_prompt_caching_beta=True
        )
    )

    # Define function for tool calling
    weather_function = FunctionSchema(
        name="get_weather",
        description="Get current weather information",
        properties={
            "location": {
                "type": "string",
                "description": "City and state, e.g. San Francisco, CA"
            }
        },
        required=["location"]
    )
    tools = ToolsSchema(standard_tools=[weather_function])

    # Create context with an initial user message
    context = OpenAILLMContext(
        messages=[{"role": "user", "content": "What's the weather like?"}],
        tools=tools
    )

    # Create context aggregators
    context_aggregator = llm.create_context_aggregator(context)

    # Register function handler
    async def get_weather(params):
        location = params.arguments["location"]
        await params.result_callback(f"Weather in {location}: 72°F and sunny")

    llm.register_function("get_weather", get_weather)

    # Use in pipeline
    pipeline = Pipeline([
        transport.input(),
        stt,
        context_aggregator.user(),      # Handles user messages
        llm,                            # Processes with Anthropic
        tts,
        transport.output(),
        context_aggregator.assistant()  # Captures responses
    ])

Metrics

The service provides:

Time to First Byte (TTFB) - Latency from request to first response token
Processing Duration - Total request processing time
Token Usage - Prompt tokens, completion tokens, and total usage
Cache Metrics - Cache creation and read token usage

Enable with:

    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
            enable_usage_metrics=True
        )
    )

Additional Notes

Streaming Responses: All responses are streamed for low latency
Context Persistence: Use context aggregators to maintain conversation history
Error Handling: Automatic retry logic for rate limits and transient errors
Message Format: Automatically converts between OpenAI and Anthropic message formats
Prompt Caching: Reduces costs and latency for repeated context patterns (see the sketch below)
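Prompt caching can also be toggled at runtime with the LLMEnablePromptCachingFrame listed under Input frames above. A hedged sketch, assuming the frame carries an enable flag (the exact field name may differ in your Pipecat version):

    # Hedged sketch: toggle Anthropic prompt caching mid-session by queueing a
    # control frame onto the running PipelineTask. The `enable` field name is
    # an assumption based on the frame's purpose; check your Pipecat version.
    from pipecat.frames.frames import LLMEnablePromptCachingFrame

    async def set_prompt_caching(task, enabled: bool):
        await task.queue_frame(LLMEnablePromptCachingFrame(enable=enabled))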
\ No newline at end of file
diff --git a/llm_azure_12f2bcc6.txt b/llm_azure_12f2bcc6.txt new file mode 100644 index 0000000000000000000000000000000000000000..a6f1476e3c109b6ce43423e1f868d08a9c6b611c --- /dev/null +++ b/llm_azure_12f2bcc6.txt @@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/server/services/llm/azure#azure-vs-openai-differences
+Title: Azure - Pipecat
+==================================================
+
Overview

AzureLLMService provides access to Azure OpenAI's language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management.
API Reference - Complete API documentation and method details
Azure OpenAI Docs - Official Azure OpenAI documentation and setup
Example Code - Working example with function calling

Installation

To use Azure OpenAI services, install the required dependency:

    pip install "pipecat-ai[azure]"

You'll need to set up your Azure OpenAI credentials:

AZURE_CHATGPT_API_KEY - Your Azure OpenAI API key
AZURE_CHATGPT_ENDPOINT - Your Azure OpenAI endpoint URL
AZURE_CHATGPT_MODEL - Your model deployment name

Get your credentials from the Azure Portal under your Azure OpenAI resource.
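A small sketch of checking those three settings before constructing the service; the helper below is illustrative, uses only the standard library, and is not part of Pipecat:

    # Illustrative helper (not part of Pipecat): fail fast if any of the Azure
    # OpenAI settings listed above are missing from the environment.
    import os

    REQUIRED_VARS = ("AZURE_CHATGPT_API_KEY", "AZURE_CHATGPT_ENDPOINT", "AZURE_CHATGPT_MODEL")

    def azure_openai_settings() -> dict:
        missing = [name for name in REQUIRED_VARS if not os.getenv(name)]
        if missing:
            raise RuntimeError(f"Missing Azure OpenAI settings: {', '.join(missing)}")
        return {name: os.environ[name] for name in REQUIRED_VARS}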
Frames

Input

OpenAILLMContextFrame - Conversation context and history
LLMMessagesFrame - Direct message list
VisionImageRawFrame - Images for vision processing
LLMUpdateSettingsFrame - Runtime parameter updates

Output

LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
LLMTextFrame - Streamed completion chunks
FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
ErrorFrame - API or processing errors

Azure vs OpenAI Differences

Feature        | Azure OpenAI              | Standard OpenAI
Authentication | API key + endpoint        | API key only
Deployment     | Custom deployment names   | Model names directly
Compliance     | Enterprise SOC, HIPAA     | Standard compliance
Regional       | Multiple Azure regions    | OpenAI regions only
Pricing        | Azure billing integration | OpenAI billing

Function Calling

Function Calling Guide - Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

Context Management Guide - Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

    import os

    from pipecat.services.azure.llm import AzureLLMService
    from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
    from pipecat.adapters.schemas.function_schema import FunctionSchema
    from pipecat.adapters.schemas.tools_schema import ToolsSchema

    # Configure Azure OpenAI service
    llm = AzureLLMService(
        api_key=os.getenv("AZURE_CHATGPT_API_KEY"),
        endpoint=os.getenv("AZURE_CHATGPT_ENDPOINT"),
        model=os.getenv("AZURE_CHATGPT_MODEL"),  # Your deployment name
        params=AzureLLMService.InputParams(
            temperature=0.7,
            max_completion_tokens=1000
        )
    )

    # Define function for tool calling
    weather_function = FunctionSchema(
        name="get_current_weather",
        description="Get current weather information",
        properties={
            "location": {
                "type": "string",
                "description": "City and state, e.g. San Francisco, CA"
            },
            "format": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Temperature unit to use"
            }
        },
        required=["location", "format"]
    )
    tools = ToolsSchema(standard_tools=[weather_function])

    # Create context
    context = OpenAILLMContext(
        messages=[
            {
                "role": "system",
                "content": "You are a helpful assistant. Keep responses concise for voice output."
            }
        ],
        tools=tools
    )

    # Create context aggregators
    context_aggregator = llm.create_context_aggregator(context)

    # Register function handler with event callback
    async def fetch_weather(params):
        location = params.arguments["location"]
        await params.result_callback({"conditions": "sunny", "temperature": "75°F"})

    llm.register_function("get_current_weather", fetch_weather)

    # Optional: Add function call feedback
    @llm.event_handler("on_function_calls_started")
    async def on_function_calls_started(service, function_calls):
        await tts.queue_frame(TTSSpeakFrame("Let me check on that."))

    # Use in pipeline
    pipeline = Pipeline([
        transport.input(),
        stt,
        context_aggregator.user(),
        llm,
        tts,
        transport.output(),
        context_aggregator.assistant()
    ])

Metrics

Inherits all OpenAI metrics capabilities:

Time to First Byte (TTFB) - Response latency measurement
Processing Duration - Total request processing time
Token Usage - Prompt tokens, completion tokens, and totals

Enable with:

    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
            enable_usage_metrics=True
        )
    )

Additional Notes

OpenAI Compatibility: Full compatibility with OpenAI API features and parameters
Regional Deployment: Deploy in your preferred Azure region for compliance and latency
Deployment Names: Use your Azure deployment name as the model parameter, not OpenAI model names (see the sketch below)
Automatic Retries: Built-in retry logic handles transient Azure service issues
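To make the Deployment Names note concrete, here is a hedged sketch contrasting the two constructors; the OpenAILLMService import path mirrors the Azure one and is an assumption about your Pipecat version:

    # Hedged sketch: Azure takes a deployment name as `model`, while the plain
    # OpenAI service takes an OpenAI model name. The OpenAILLMService import
    # path below is assumed by analogy with pipecat.services.azure.llm.
    import os

    from pipecat.services.azure.llm import AzureLLMService
    from pipecat.services.openai.llm import OpenAILLMService  # assumed path

    azure_llm = AzureLLMService(
        api_key=os.getenv("AZURE_CHATGPT_API_KEY"),
        endpoint=os.getenv("AZURE_CHATGPT_ENDPOINT"),
        model=os.getenv("AZURE_CHATGPT_MODEL"),  # the deployment name you chose in Azure
    )

    openai_llm = OpenAILLMService(
        api_key=os.getenv("OPENAI_API_KEY"),
        model="gpt-4o",  # an OpenAI model name, not a deployment name
    )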
API Reference Complete API documentation and method details Azure OpenAI Docs Official Azure OpenAI documentation and setup Example Code Working example with function calling ​ Installation To use Azure OpenAI services, install the required dependency: Copy Ask AI pip install "pipecat-ai[azure]" You’ll need to set up your Azure OpenAI credentials: AZURE_CHATGPT_API_KEY - Your Azure OpenAI API key AZURE_CHATGPT_ENDPOINT - Your Azure OpenAI endpoint URL AZURE_CHATGPT_MODEL - Your model deployment name Get your credentials from the Azure Portal under your Azure OpenAI resource. ​ Frames ​ Input OpenAILLMContextFrame - Conversation context and history LLMMessagesFrame - Direct message list VisionImageRawFrame - Images for vision processing LLMUpdateSettingsFrame - Runtime parameter updates ​ Output LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries LLMTextFrame - Streamed completion chunks FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle ErrorFrame - API or processing errors ​ Azure vs OpenAI Differences Feature Azure OpenAI Standard OpenAI Authentication API key + endpoint API key only Deployment Custom deployment names Model names directly Compliance Enterprise SOC, HIPAA Standard compliance Regional Multiple Azure regions OpenAI regions only Pricing Azure billing integration OpenAI billing ​ Function Calling Function Calling Guide Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications. ​ Context Management Context Management Guide Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences. ​ Usage Example Copy Ask AI import os from pipecat.services.azure.llm import AzureLLMService from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext from pipecat.adapters.schemas.function_schema import FunctionSchema from pipecat.adapters.schemas.tools_schema import ToolsSchema # Configure Azure OpenAI service llm = AzureLLMService( api_key = os.getenv( "AZURE_CHATGPT_API_KEY" ), endpoint = os.getenv( "AZURE_CHATGPT_ENDPOINT" ), model = os.getenv( "AZURE_CHATGPT_MODEL" ), # Your deployment name params = AzureLLMService.InputParams( temperature = 0.7 , max_completion_tokens = 1000 ) ) # Define function for tool calling weather_function = FunctionSchema( name = "get_current_weather" , description = "Get current weather information" , properties = { "location" : { "type" : "string" , "description" : "City and state, e.g. San Francisco, CA" }, "format" : { "type" : "string" , "enum" : [ "celsius" , "fahrenheit" ], "description" : "Temperature unit to use" } }, required = [ "location" , "format" ] ) tools = ToolsSchema( standard_tools = [weather_function]) # Create context context = OpenAILLMContext( messages = [ { "role" : "system" , "content" : "You are a helpful assistant. Keep responses concise for voice output." 
} ], tools = tools ) # Create context aggregators context_aggregator = llm.create_context_aggregator(context) # Register function handler with event callback async def fetch_weather ( params ): location = params.arguments[ "location" ] await params.result_callback({ "conditions" : "sunny" , "temperature" : "75°F" }) llm.register_function( "get_current_weather" , fetch_weather) # Optional: Add function call feedback @llm.event_handler ( "on_function_calls_started" ) async def on_function_calls_started ( service , function_calls ): await tts.queue_frame(TTSSpeakFrame( "Let me check on that." )) # Use in pipeline pipeline = Pipeline([ transport.input(), stt, context_aggregator.user(), llm, tts, transport.output(), context_aggregator.assistant() ]) ​ Metrics Inherits all OpenAI metrics capabilities: Time to First Byte (TTFB) - Response latency measurement Processing Duration - Total request processing time Token Usage - Prompt tokens, completion tokens, and totals Enable with: Copy Ask AI task = PipelineTask( pipeline, params = PipelineParams( enable_metrics = True , enable_usage_metrics = True ) ) ​ Additional Notes OpenAI Compatibility : Full compatibility with OpenAI API features and parameters Regional Deployment : Deploy in your preferred Azure region for compliance and latency Deployment Names : Use your Azure deployment name as the model parameter, not OpenAI model names Automatic Retries : Built-in retry logic handles transient Azure service issues AWS Bedrock Cerebras On this page Overview Installation Frames Input Output Azure vs OpenAI Differences Function Calling Context Management Usage Example Metrics Additional Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/llm_azure_52726914.txt b/llm_azure_52726914.txt new file mode 100644 index 0000000000000000000000000000000000000000..a4d4e84e4f6ccf83b890dccceb69e283d59107d7 --- /dev/null +++ b/llm_azure_52726914.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/services/llm/azure#output +Title: Azure - Pipecat +================================================== + +Azure - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation LLM Azure Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Anthropic AWS Bedrock Azure Cerebras DeepSeek Fireworks AI Google Gemini Google Vertex AI Grok Groq NVIDIA NIM Ollama OpenAI OpenPipe OpenRouter Perplexity Qwen SambaNova Together AI Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview AzureLLMService provides access to Azure OpenAI’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management. 
API Reference Complete API documentation and method details Azure OpenAI Docs Official Azure OpenAI documentation and setup Example Code Working example with function calling ​ Installation To use Azure OpenAI services, install the required dependency: Copy Ask AI pip install "pipecat-ai[azure]" You’ll need to set up your Azure OpenAI credentials: AZURE_CHATGPT_API_KEY - Your Azure OpenAI API key AZURE_CHATGPT_ENDPOINT - Your Azure OpenAI endpoint URL AZURE_CHATGPT_MODEL - Your model deployment name Get your credentials from the Azure Portal under your Azure OpenAI resource. ​ Frames ​ Input OpenAILLMContextFrame - Conversation context and history LLMMessagesFrame - Direct message list VisionImageRawFrame - Images for vision processing LLMUpdateSettingsFrame - Runtime parameter updates ​ Output LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries LLMTextFrame - Streamed completion chunks FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle ErrorFrame - API or processing errors ​ Azure vs OpenAI Differences Feature Azure OpenAI Standard OpenAI Authentication API key + endpoint API key only Deployment Custom deployment names Model names directly Compliance Enterprise SOC, HIPAA Standard compliance Regional Multiple Azure regions OpenAI regions only Pricing Azure billing integration OpenAI billing ​ Function Calling Function Calling Guide Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications. ​ Context Management Context Management Guide Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences. ​ Usage Example Copy Ask AI import os from pipecat.services.azure.llm import AzureLLMService from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext from pipecat.adapters.schemas.function_schema import FunctionSchema from pipecat.adapters.schemas.tools_schema import ToolsSchema # Configure Azure OpenAI service llm = AzureLLMService( api_key = os.getenv( "AZURE_CHATGPT_API_KEY" ), endpoint = os.getenv( "AZURE_CHATGPT_ENDPOINT" ), model = os.getenv( "AZURE_CHATGPT_MODEL" ), # Your deployment name params = AzureLLMService.InputParams( temperature = 0.7 , max_completion_tokens = 1000 ) ) # Define function for tool calling weather_function = FunctionSchema( name = "get_current_weather" , description = "Get current weather information" , properties = { "location" : { "type" : "string" , "description" : "City and state, e.g. San Francisco, CA" }, "format" : { "type" : "string" , "enum" : [ "celsius" , "fahrenheit" ], "description" : "Temperature unit to use" } }, required = [ "location" , "format" ] ) tools = ToolsSchema( standard_tools = [weather_function]) # Create context context = OpenAILLMContext( messages = [ { "role" : "system" , "content" : "You are a helpful assistant. Keep responses concise for voice output." 
} ], tools = tools ) # Create context aggregators context_aggregator = llm.create_context_aggregator(context) # Register function handler with event callback async def fetch_weather ( params ): location = params.arguments[ "location" ] await params.result_callback({ "conditions" : "sunny" , "temperature" : "75°F" }) llm.register_function( "get_current_weather" , fetch_weather) # Optional: Add function call feedback @llm.event_handler ( "on_function_calls_started" ) async def on_function_calls_started ( service , function_calls ): await tts.queue_frame(TTSSpeakFrame( "Let me check on that." )) # Use in pipeline pipeline = Pipeline([ transport.input(), stt, context_aggregator.user(), llm, tts, transport.output(), context_aggregator.assistant() ]) ​ Metrics Inherits all OpenAI metrics capabilities: Time to First Byte (TTFB) - Response latency measurement Processing Duration - Total request processing time Token Usage - Prompt tokens, completion tokens, and totals Enable with: Copy Ask AI task = PipelineTask( pipeline, params = PipelineParams( enable_metrics = True , enable_usage_metrics = True ) ) ​ Additional Notes OpenAI Compatibility : Full compatibility with OpenAI API features and parameters Regional Deployment : Deploy in your preferred Azure region for compliance and latency Deployment Names : Use your Azure deployment name as the model parameter, not OpenAI model names Automatic Retries : Built-in retry logic handles transient Azure service issues AWS Bedrock Cerebras On this page Overview Installation Frames Input Output Azure vs OpenAI Differences Function Calling Context Management Usage Example Metrics Additional Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/llm_azure_56823eb0.txt b/llm_azure_56823eb0.txt new file mode 100644 index 0000000000000000000000000000000000000000..67ca9ca7f7e13d79fbb47455074600ae8928e287 --- /dev/null +++ b/llm_azure_56823eb0.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/services/llm/azure#frames +Title: Azure - Pipecat +================================================== + +Azure - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation LLM Azure Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Anthropic AWS Bedrock Azure Cerebras DeepSeek Fireworks AI Google Gemini Google Vertex AI Grok Groq NVIDIA NIM Ollama OpenAI OpenPipe OpenRouter Perplexity Qwen SambaNova Together AI Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview AzureLLMService provides access to Azure OpenAI’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management. 
API Reference Complete API documentation and method details Azure OpenAI Docs Official Azure OpenAI documentation and setup Example Code Working example with function calling ​ Installation To use Azure OpenAI services, install the required dependency: Copy Ask AI pip install "pipecat-ai[azure]" You’ll need to set up your Azure OpenAI credentials: AZURE_CHATGPT_API_KEY - Your Azure OpenAI API key AZURE_CHATGPT_ENDPOINT - Your Azure OpenAI endpoint URL AZURE_CHATGPT_MODEL - Your model deployment name Get your credentials from the Azure Portal under your Azure OpenAI resource. ​ Frames ​ Input OpenAILLMContextFrame - Conversation context and history LLMMessagesFrame - Direct message list VisionImageRawFrame - Images for vision processing LLMUpdateSettingsFrame - Runtime parameter updates ​ Output LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries LLMTextFrame - Streamed completion chunks FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle ErrorFrame - API or processing errors ​ Azure vs OpenAI Differences Feature Azure OpenAI Standard OpenAI Authentication API key + endpoint API key only Deployment Custom deployment names Model names directly Compliance Enterprise SOC, HIPAA Standard compliance Regional Multiple Azure regions OpenAI regions only Pricing Azure billing integration OpenAI billing ​ Function Calling Function Calling Guide Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications. ​ Context Management Context Management Guide Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences. ​ Usage Example Copy Ask AI import os from pipecat.services.azure.llm import AzureLLMService from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext from pipecat.adapters.schemas.function_schema import FunctionSchema from pipecat.adapters.schemas.tools_schema import ToolsSchema # Configure Azure OpenAI service llm = AzureLLMService( api_key = os.getenv( "AZURE_CHATGPT_API_KEY" ), endpoint = os.getenv( "AZURE_CHATGPT_ENDPOINT" ), model = os.getenv( "AZURE_CHATGPT_MODEL" ), # Your deployment name params = AzureLLMService.InputParams( temperature = 0.7 , max_completion_tokens = 1000 ) ) # Define function for tool calling weather_function = FunctionSchema( name = "get_current_weather" , description = "Get current weather information" , properties = { "location" : { "type" : "string" , "description" : "City and state, e.g. San Francisco, CA" }, "format" : { "type" : "string" , "enum" : [ "celsius" , "fahrenheit" ], "description" : "Temperature unit to use" } }, required = [ "location" , "format" ] ) tools = ToolsSchema( standard_tools = [weather_function]) # Create context context = OpenAILLMContext( messages = [ { "role" : "system" , "content" : "You are a helpful assistant. Keep responses concise for voice output." 
} ], tools = tools ) # Create context aggregators context_aggregator = llm.create_context_aggregator(context) # Register function handler with event callback async def fetch_weather ( params ): location = params.arguments[ "location" ] await params.result_callback({ "conditions" : "sunny" , "temperature" : "75°F" }) llm.register_function( "get_current_weather" , fetch_weather) # Optional: Add function call feedback @llm.event_handler ( "on_function_calls_started" ) async def on_function_calls_started ( service , function_calls ): await tts.queue_frame(TTSSpeakFrame( "Let me check on that." )) # Use in pipeline pipeline = Pipeline([ transport.input(), stt, context_aggregator.user(), llm, tts, transport.output(), context_aggregator.assistant() ]) ​ Metrics Inherits all OpenAI metrics capabilities: Time to First Byte (TTFB) - Response latency measurement Processing Duration - Total request processing time Token Usage - Prompt tokens, completion tokens, and totals Enable with: Copy Ask AI task = PipelineTask( pipeline, params = PipelineParams( enable_metrics = True , enable_usage_metrics = True ) ) ​ Additional Notes OpenAI Compatibility : Full compatibility with OpenAI API features and parameters Regional Deployment : Deploy in your preferred Azure region for compliance and latency Deployment Names : Use your Azure deployment name as the model parameter, not OpenAI model names Automatic Retries : Built-in retry logic handles transient Azure service issues AWS Bedrock Cerebras On this page Overview Installation Frames Input Output Azure vs OpenAI Differences Function Calling Context Management Usage Example Metrics Additional Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/llm_azure_6658be14.txt b/llm_azure_6658be14.txt new file mode 100644 index 0000000000000000000000000000000000000000..dd54f8f2b39651220ab5d29d6649c983dc49c0b9 --- /dev/null +++ b/llm_azure_6658be14.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/services/llm/azure +Title: Azure - Pipecat +================================================== + +Azure - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation LLM Azure Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Anthropic AWS Bedrock Azure Cerebras DeepSeek Fireworks AI Google Gemini Google Vertex AI Grok Groq NVIDIA NIM Ollama OpenAI OpenPipe OpenRouter Perplexity Qwen SambaNova Together AI Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview AzureLLMService provides access to Azure OpenAI’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management. 
API Reference Complete API documentation and method details Azure OpenAI Docs Official Azure OpenAI documentation and setup Example Code Working example with function calling ​ Installation To use Azure OpenAI services, install the required dependency: Copy Ask AI pip install "pipecat-ai[azure]" You’ll need to set up your Azure OpenAI credentials: AZURE_CHATGPT_API_KEY - Your Azure OpenAI API key AZURE_CHATGPT_ENDPOINT - Your Azure OpenAI endpoint URL AZURE_CHATGPT_MODEL - Your model deployment name Get your credentials from the Azure Portal under your Azure OpenAI resource. ​ Frames ​ Input OpenAILLMContextFrame - Conversation context and history LLMMessagesFrame - Direct message list VisionImageRawFrame - Images for vision processing LLMUpdateSettingsFrame - Runtime parameter updates ​ Output LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries LLMTextFrame - Streamed completion chunks FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle ErrorFrame - API or processing errors ​ Azure vs OpenAI Differences Feature Azure OpenAI Standard OpenAI Authentication API key + endpoint API key only Deployment Custom deployment names Model names directly Compliance Enterprise SOC, HIPAA Standard compliance Regional Multiple Azure regions OpenAI regions only Pricing Azure billing integration OpenAI billing ​ Function Calling Function Calling Guide Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications. ​ Context Management Context Management Guide Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences. ​ Usage Example Copy Ask AI import os from pipecat.services.azure.llm import AzureLLMService from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext from pipecat.adapters.schemas.function_schema import FunctionSchema from pipecat.adapters.schemas.tools_schema import ToolsSchema # Configure Azure OpenAI service llm = AzureLLMService( api_key = os.getenv( "AZURE_CHATGPT_API_KEY" ), endpoint = os.getenv( "AZURE_CHATGPT_ENDPOINT" ), model = os.getenv( "AZURE_CHATGPT_MODEL" ), # Your deployment name params = AzureLLMService.InputParams( temperature = 0.7 , max_completion_tokens = 1000 ) ) # Define function for tool calling weather_function = FunctionSchema( name = "get_current_weather" , description = "Get current weather information" , properties = { "location" : { "type" : "string" , "description" : "City and state, e.g. San Francisco, CA" }, "format" : { "type" : "string" , "enum" : [ "celsius" , "fahrenheit" ], "description" : "Temperature unit to use" } }, required = [ "location" , "format" ] ) tools = ToolsSchema( standard_tools = [weather_function]) # Create context context = OpenAILLMContext( messages = [ { "role" : "system" , "content" : "You are a helpful assistant. Keep responses concise for voice output." 
} ], tools = tools ) # Create context aggregators context_aggregator = llm.create_context_aggregator(context) # Register function handler with event callback async def fetch_weather ( params ): location = params.arguments[ "location" ] await params.result_callback({ "conditions" : "sunny" , "temperature" : "75°F" }) llm.register_function( "get_current_weather" , fetch_weather) # Optional: Add function call feedback @llm.event_handler ( "on_function_calls_started" ) async def on_function_calls_started ( service , function_calls ): await tts.queue_frame(TTSSpeakFrame( "Let me check on that." )) # Use in pipeline pipeline = Pipeline([ transport.input(), stt, context_aggregator.user(), llm, tts, transport.output(), context_aggregator.assistant() ]) ​ Metrics Inherits all OpenAI metrics capabilities: Time to First Byte (TTFB) - Response latency measurement Processing Duration - Total request processing time Token Usage - Prompt tokens, completion tokens, and totals Enable with: Copy Ask AI task = PipelineTask( pipeline, params = PipelineParams( enable_metrics = True , enable_usage_metrics = True ) ) ​ Additional Notes OpenAI Compatibility : Full compatibility with OpenAI API features and parameters Regional Deployment : Deploy in your preferred Azure region for compliance and latency Deployment Names : Use your Azure deployment name as the model parameter, not OpenAI model names Automatic Retries : Built-in retry logic handles transient Azure service issues AWS Bedrock Cerebras On this page Overview Installation Frames Input Output Azure vs OpenAI Differences Function Calling Context Management Usage Example Metrics Additional Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/llm_azure_95f603e8.txt b/llm_azure_95f603e8.txt new file mode 100644 index 0000000000000000000000000000000000000000..64a77c0d1bf84dc2b5204a078773e9020cc1a926 --- /dev/null +++ b/llm_azure_95f603e8.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/services/llm/azure#installation +Title: Azure - Pipecat +================================================== + +Azure - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation LLM Azure Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Anthropic AWS Bedrock Azure Cerebras DeepSeek Fireworks AI Google Gemini Google Vertex AI Grok Groq NVIDIA NIM Ollama OpenAI OpenPipe OpenRouter Perplexity Qwen SambaNova Together AI Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview AzureLLMService provides access to Azure OpenAI’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management. 
API Reference Complete API documentation and method details Azure OpenAI Docs Official Azure OpenAI documentation and setup Example Code Working example with function calling ​ Installation To use Azure OpenAI services, install the required dependency: Copy Ask AI pip install "pipecat-ai[azure]" You’ll need to set up your Azure OpenAI credentials: AZURE_CHATGPT_API_KEY - Your Azure OpenAI API key AZURE_CHATGPT_ENDPOINT - Your Azure OpenAI endpoint URL AZURE_CHATGPT_MODEL - Your model deployment name Get your credentials from the Azure Portal under your Azure OpenAI resource. ​ Frames ​ Input OpenAILLMContextFrame - Conversation context and history LLMMessagesFrame - Direct message list VisionImageRawFrame - Images for vision processing LLMUpdateSettingsFrame - Runtime parameter updates ​ Output LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries LLMTextFrame - Streamed completion chunks FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle ErrorFrame - API or processing errors ​ Azure vs OpenAI Differences Feature Azure OpenAI Standard OpenAI Authentication API key + endpoint API key only Deployment Custom deployment names Model names directly Compliance Enterprise SOC, HIPAA Standard compliance Regional Multiple Azure regions OpenAI regions only Pricing Azure billing integration OpenAI billing ​ Function Calling Function Calling Guide Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications. ​ Context Management Context Management Guide Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences. ​ Usage Example Copy Ask AI import os from pipecat.services.azure.llm import AzureLLMService from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext from pipecat.adapters.schemas.function_schema import FunctionSchema from pipecat.adapters.schemas.tools_schema import ToolsSchema # Configure Azure OpenAI service llm = AzureLLMService( api_key = os.getenv( "AZURE_CHATGPT_API_KEY" ), endpoint = os.getenv( "AZURE_CHATGPT_ENDPOINT" ), model = os.getenv( "AZURE_CHATGPT_MODEL" ), # Your deployment name params = AzureLLMService.InputParams( temperature = 0.7 , max_completion_tokens = 1000 ) ) # Define function for tool calling weather_function = FunctionSchema( name = "get_current_weather" , description = "Get current weather information" , properties = { "location" : { "type" : "string" , "description" : "City and state, e.g. San Francisco, CA" }, "format" : { "type" : "string" , "enum" : [ "celsius" , "fahrenheit" ], "description" : "Temperature unit to use" } }, required = [ "location" , "format" ] ) tools = ToolsSchema( standard_tools = [weather_function]) # Create context context = OpenAILLMContext( messages = [ { "role" : "system" , "content" : "You are a helpful assistant. Keep responses concise for voice output." 
} ], tools = tools ) # Create context aggregators context_aggregator = llm.create_context_aggregator(context) # Register function handler with event callback async def fetch_weather ( params ): location = params.arguments[ "location" ] await params.result_callback({ "conditions" : "sunny" , "temperature" : "75°F" }) llm.register_function( "get_current_weather" , fetch_weather) # Optional: Add function call feedback @llm.event_handler ( "on_function_calls_started" ) async def on_function_calls_started ( service , function_calls ): await tts.queue_frame(TTSSpeakFrame( "Let me check on that." )) # Use in pipeline pipeline = Pipeline([ transport.input(), stt, context_aggregator.user(), llm, tts, transport.output(), context_aggregator.assistant() ]) ​ Metrics Inherits all OpenAI metrics capabilities: Time to First Byte (TTFB) - Response latency measurement Processing Duration - Total request processing time Token Usage - Prompt tokens, completion tokens, and totals Enable with: Copy Ask AI task = PipelineTask( pipeline, params = PipelineParams( enable_metrics = True , enable_usage_metrics = True ) ) ​ Additional Notes OpenAI Compatibility : Full compatibility with OpenAI API features and parameters Regional Deployment : Deploy in your preferred Azure region for compliance and latency Deployment Names : Use your Azure deployment name as the model parameter, not OpenAI model names Automatic Retries : Built-in retry logic handles transient Azure service issues AWS Bedrock Cerebras On this page Overview Installation Frames Input Output Azure vs OpenAI Differences Function Calling Context Management Usage Example Metrics Additional Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/llm_azure_bf8067b5.txt b/llm_azure_bf8067b5.txt new file mode 100644 index 0000000000000000000000000000000000000000..a608876cfdf9cea24bab0b5a62d6fa4c47d04f75 --- /dev/null +++ b/llm_azure_bf8067b5.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/services/llm/azure#overview +Title: Azure - Pipecat +================================================== + +Azure - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation LLM Azure Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Anthropic AWS Bedrock Azure Cerebras DeepSeek Fireworks AI Google Gemini Google Vertex AI Grok Groq NVIDIA NIM Ollama OpenAI OpenPipe OpenRouter Perplexity Qwen SambaNova Together AI Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview AzureLLMService provides access to Azure OpenAI’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management. 
URL: https://docs.pipecat.ai/server/services/llm/cerebras#usage-example
Title: Cerebras - Pipecat
==================================================

Overview

CerebrasLLMService provides access to Cerebras’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management.

API Reference - Complete API documentation and method details
Cerebras Docs - Official Cerebras inference API documentation
Example Code - Working example with function calling

Installation

To use Cerebras services, install the required dependency:

pip install "pipecat-ai[cerebras]"

You’ll also need to set up your Cerebras API key as an environment variable: CEREBRAS_API_KEY. Get your API key from Cerebras Cloud.
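During development, a common way to supply this key (and the other API keys on these pages) is a local .env file. A minimal sketch using the python-dotenv package, which is an assumption here rather than something Pipecat requires; any mechanism that sets environment variables works:

import os

from dotenv import load_dotenv  # pip install python-dotenv

# Read key=value pairs from a local .env file into the process environment.
load_dotenv(override=True)

api_key = os.getenv("CEREBRAS_API_KEY")
assert api_key, "CEREBRAS_API_KEY is not set"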
Frames

Input

- OpenAILLMContextFrame - Conversation context and history
- LLMMessagesFrame - Direct message list
- VisionImageRawFrame - Images for vision processing
- LLMUpdateSettingsFrame - Runtime parameter updates

Output

- LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
- LLMTextFrame - Streamed completion chunks
- FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
- ErrorFrame - API or processing errors

A sketch of observing these output frames in a custom processor follows this list.
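As an illustration of how these output frames can be observed, here is a minimal sketch of a pass-through processor that logs streamed LLMTextFrame chunks. It assumes the FrameProcessor and FrameDirection classes from pipecat.processors.frame_processor and LLMTextFrame from pipecat.frames.frames; exact signatures may differ between Pipecat versions.

from pipecat.frames.frames import LLMTextFrame
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor


class LLMTextLogger(FrameProcessor):
    """Logs streamed LLM text chunks and forwards every frame unchanged."""

    async def process_frame(self, frame, direction: FrameDirection):
        await super().process_frame(frame, direction)
        if isinstance(frame, LLMTextFrame):
            print(f"LLM chunk: {frame.text!r}")
        # Always forward the frame so the rest of the pipeline keeps working.
        await self.push_frame(frame, direction)

Placed between llm and tts in the pipeline from the usage example below, this would print each streamed chunk before it reaches speech synthesis.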
Function Calling

Function Calling Guide - Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

Context Management Guide - Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

import os

from pipecat.services.cerebras.llm import CerebrasLLMService
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema

# Configure the service
llm = CerebrasLLMService(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    model="llama-3.3-70b",
    params=CerebrasLLMService.InputParams(
        temperature=0.7,
        max_completion_tokens=1000
    )
)

# Define function for tool calling
weather_function = FunctionSchema(
    name="get_current_weather",
    description="Get current weather information",
    properties={
        "location": {
            "type": "string",
            "description": "City and state, e.g. San Francisco, CA"
        },
        "format": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "Temperature unit to use"
        }
    },
    required=["location", "format"]
)

tools = ToolsSchema(standard_tools=[weather_function])

# Create context
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant for weather information. Keep responses concise for voice output."
        }
    ],
    tools=tools
)

# Create context aggregators
context_aggregator = llm.create_context_aggregator(context)

# Register function handler
async def fetch_weather(params):
    location = params.arguments["location"]
    await params.result_callback({"conditions": "sunny", "temperature": "75°F"})

llm.register_function("get_current_weather", fetch_weather)

# Optional: Add function call feedback
@llm.event_handler("on_function_calls_started")
async def on_function_calls_started(service, function_calls):
    await tts.queue_frame(TTSSpeakFrame("Let me check on that."))

# Use in pipeline (transport, stt, tts, Pipeline, and TTSSpeakFrame are assumed
# to be set up and imported elsewhere in your application)
pipeline = Pipeline([
    transport.input(),
    stt,
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),
    context_aggregator.assistant()
])

Metrics

Inherits all OpenAI-compatible metrics:

- Time to First Byte (TTFB) - Ultra-low latency measurement
- Processing Duration - Total request processing time
- Token Usage - Prompt tokens, completion tokens, and totals

Enable with:

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True
    )
)

Additional Notes

- OpenAI Compatibility: Full compatibility with OpenAI API parameters and responses
- Streaming Responses: All responses are streamed for minimal latency
- Function Calling: Full support for OpenAI-style tool calling
- Open Source Models: Access to latest Llama models with commercial licensing
URL: https://docs.pipecat.ai/server/services/llm/deepseek#input
Title: DeepSeek - Pipecat
==================================================

Overview

DeepSeekLLMService provides access to DeepSeek’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management.

API Reference - Complete API documentation and method details
DeepSeek Docs - Official DeepSeek API documentation and features
Example Code - Working example with function calling

Installation

To use DeepSeek services, install the required dependency:

pip install "pipecat-ai[deepseek]"

You’ll also need to set up your DeepSeek API key as an environment variable: DEEPSEEK_API_KEY. Get your API key from DeepSeek Platform.

Frames

Input

- OpenAILLMContextFrame - Conversation context and history
- LLMMessagesFrame - Direct message list
- VisionImageRawFrame - Images for vision processing
- LLMUpdateSettingsFrame - Runtime parameter updates

Output

- LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
- LLMTextFrame - Streamed completion chunks
- FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
- ErrorFrame - API or processing errors
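The LLMUpdateSettingsFrame input listed above is how generation parameters can be changed at runtime without rebuilding the service. A minimal sketch, assuming LLMUpdateSettingsFrame accepts a settings dictionary and that you have a running PipelineTask (both taken from other parts of these docs; verify the frame's constructor in your Pipecat version):

from pipecat.frames.frames import LLMUpdateSettingsFrame


async def make_responses_more_deterministic(task):
    """Lower the LLM temperature mid-session on a running PipelineTask."""
    # The service applies the new settings to subsequent completions.
    await task.queue_frames([LLMUpdateSettingsFrame(settings={"temperature": 0.3})])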
Function Calling

Function Calling Guide - Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

Context Management Guide - Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

import os

from pipecat.services.deepseek.llm import DeepSeekLLMService
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema

# Configure DeepSeek service
llm = DeepSeekLLMService(
    api_key=os.getenv("DEEPSEEK_API_KEY"),
    model="deepseek-chat",
    params=DeepSeekLLMService.InputParams(
        temperature=0.7,
        max_tokens=1000
    )
)

# Define function for tool calling
weather_function = FunctionSchema(
    name="get_current_weather",
    description="Get current weather information",
    properties={
        "location": {
            "type": "string",
            "description": "City and state, e.g. San Francisco, CA"
        },
        "format": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "Temperature unit to use"
        }
    },
    required=["location", "format"]
)

tools = ToolsSchema(standard_tools=[weather_function])

# Create context with reasoning-focused system message
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": """You are a helpful assistant with strong reasoning capabilities.
            Infer temperature units based on location unless specified.
            Provide logical, step-by-step responses."""
        }
    ],
    tools=tools
)

# Create context aggregators
context_aggregator = llm.create_context_aggregator(context)

# Register function handler with feedback
async def fetch_weather(params):
    location = params.arguments["location"]
    await params.result_callback({"conditions": "sunny", "temperature": "75°F"})

llm.register_function("get_current_weather", fetch_weather)

# Optional: Add function call feedback
@llm.event_handler("on_function_calls_started")
async def on_function_calls_started(service, function_calls):
    await tts.queue_frame(TTSSpeakFrame("Let me check on that."))

# Use in pipeline (transport, stt, tts, Pipeline, and TTSSpeakFrame are assumed
# to be set up and imported elsewhere in your application)
pipeline = Pipeline([
    transport.input(),
    stt,
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),
    context_aggregator.assistant()
])

Metrics

Inherits all OpenAI metrics capabilities:

- Time to First Byte (TTFB) - Response latency measurement
- Processing Duration - Total request processing time
- Token Usage - Prompt tokens, completion tokens, and totals

Enable with:

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True
    )
)

Additional Notes

- OpenAI Compatibility: Full compatibility with OpenAI API features and parameters
- Cost Efficiency: Competitive pricing compared to other high-capability models
- Streaming Support: Real-time response streaming for low-latency applications

URL: https://docs.pipecat.ai/server/services/llm/fireworks#usage-example
Title: Fireworks AI - Pipecat
==================================================

Overview

FireworksLLMService provides access to Fireworks AI’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management.

API Reference - Complete API documentation and method details
Fireworks Docs - Official Fireworks AI API documentation and features
Example Code - Working example with function calling

Installation

To use Fireworks AI services, install the required dependency:

pip install "pipecat-ai[fireworks]"

You’ll also need to set up your Fireworks API key as an environment variable: FIREWORKS_API_KEY. Get your API key from Fireworks AI Console.

Frames

Input

- OpenAILLMContextFrame - Conversation context and history
- LLMMessagesFrame - Direct message list
- VisionImageRawFrame - Images for vision processing
- LLMUpdateSettingsFrame - Runtime parameter updates

Output

- LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
- LLMTextFrame - Streamed completion chunks
- FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
- ErrorFrame - API or processing errors
Function Calling

Function Calling Guide - Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

Context Management Guide - Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

import os

from pipecat.services.fireworks.llm import FireworksLLMService
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema

# Configure Fireworks service
llm = FireworksLLMService(
    api_key=os.getenv("FIREWORKS_API_KEY"),
    model="accounts/fireworks/models/firefunction-v2",  # Optimized for function calling
    params=FireworksLLMService.InputParams(
        temperature=0.7,
        max_tokens=1000
    )
)

# Define function for tool calling
weather_function = FunctionSchema(
    name="get_current_weather",
    description="Get current weather information",
    properties={
        "location": {
            "type": "string",
            "description": "City and state, e.g. San Francisco, CA"
        },
        "format": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "Temperature unit to use"
        }
    },
    required=["location", "format"]
)

tools = ToolsSchema(standard_tools=[weather_function])

# Create context
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": """You are a helpful assistant optimized for voice interactions.
            Keep responses concise and avoid special characters for audio output."""
        }
    ],
    tools=tools
)

# Create context aggregators
context_aggregator = llm.create_context_aggregator(context)

# Register function handler with feedback
async def fetch_weather(params):
    location = params.arguments["location"]
    await params.result_callback({"conditions": "sunny", "temperature": "75°F"})

llm.register_function("get_current_weather", fetch_weather)

# Optional: Add function call feedback
@llm.event_handler("on_function_calls_started")
async def on_function_calls_started(service, function_calls):
    await tts.queue_frame(TTSSpeakFrame("Let me check on that."))

# Use in pipeline (transport, stt, tts, Pipeline, and TTSSpeakFrame are assumed
# to be set up and imported elsewhere in your application)
pipeline = Pipeline([
    transport.input(),
    stt,
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),
    context_aggregator.assistant()
])

Metrics

Inherits all OpenAI metrics capabilities:

- Time to First Byte (TTFB) - Response latency measurement
- Processing Duration - Total request processing time
- Token Usage - Prompt tokens, completion tokens, and totals

Enable with:

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True
    )
)

Additional Notes

- OpenAI Compatibility: Full compatibility with OpenAI API features and parameters
- Function Calling: Specialized firefunction models optimized for tool use
- Cost Effective: Competitive pricing for open-source model inference
)) # Use in pipeline pipeline = Pipeline([ transport.input(), stt, context_aggregator.user(), llm, tts, transport.output(), context_aggregator.assistant() ]) ​ Metrics Inherits all OpenAI metrics capabilities: Time to First Byte (TTFB) - Response latency measurement Processing Duration - Total request processing time Token Usage - Prompt tokens, completion tokens, and totals Enable with: Copy Ask AI task = PipelineTask( pipeline, params = PipelineParams( enable_metrics = True , enable_usage_metrics = True ) ) ​ Additional Notes OpenAI Compatibility : Full compatibility with OpenAI API features and parameters Function Calling : Specialized firefunction models optimized for tool use Cost Effective : Competitive pricing for open-source model inference DeepSeek Google Gemini On this page Overview Installation Frames Input Output Function Calling Context Management Usage Example Metrics Additional Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/llm_gemini_4c644b1d.txt b/llm_gemini_4c644b1d.txt new file mode 100644 index 0000000000000000000000000000000000000000..d381113a0d418762c2e9c9fe10b434a4e0cbee9d --- /dev/null +++ b/llm_gemini_4c644b1d.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/services/llm/gemini#usage-example +Title: Google Gemini - Pipecat +================================================== + +Google Gemini - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation LLM Google Gemini Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Anthropic AWS Bedrock Azure Cerebras DeepSeek Fireworks AI Google Gemini Google Vertex AI Grok Groq NVIDIA NIM Ollama OpenAI OpenPipe OpenRouter Perplexity Qwen SambaNova Together AI Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview GoogleLLMService provides integration with Google’s Gemini models, supporting streaming responses, function calling, and multimodal inputs. It includes specialized context handling for Google’s message format while maintaining compatibility with OpenAI-style contexts. API Reference Complete API documentation and method details Gemini Docs Official Google Gemini API documentation and features Example Code Working example with function calling ​ Installation To use GoogleLLMService , install the required dependencies: Copy Ask AI pip install "pipecat-ai[google]" You’ll also need to set up your Google API key as an environment variable: GOOGLE_API_KEY . Get your API key from Google AI Studio . 
Frames

Input
- OpenAILLMContextFrame - Conversation context and history
- LLMMessagesFrame - Direct message list
- VisionImageRawFrame - Images for vision processing
- LLMUpdateSettingsFrame - Runtime parameter updates

Output
- LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
- LLMTextFrame - Streamed completion chunks
- LLMSearchResponseFrame - Search grounding results with citations
- FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
- ErrorFrame - API or processing errors

Search Grounding

Google Gemini's search grounding feature enables real-time web search integration, allowing the model to access current information and provide citations. This is particularly valuable for applications requiring up-to-date information.

Enabling Search Grounding

# Configure search grounding tool
search_tool = {
    "google_search_retrieval": {
        "dynamic_retrieval_config": {
            "mode": "MODE_DYNAMIC",
            "dynamic_threshold": 0.3,  # Lower = more frequent grounding
        }
    }
}

# Initialize with search grounding
llm = GoogleLLMService(
    api_key=os.getenv("GOOGLE_API_KEY"),
    model="gemini-1.5-flash-002",
    system_instruction="You are a helpful assistant with access to current information.",
    tools=[search_tool]
)

Handling Search Results

Search grounding produces LLMSearchResponseFrame with detailed citation information:

@pipeline.event_handler("llm_search_response")
async def handle_search_response(frame):
    print(f"Search result: {frame.search_result}")
    print(f"Sources: {len(frame.origins)} citations")
    for origin in frame.origins:
        print(f"- {origin['site_title']}: {origin['site_uri']}")

Function Calling

Function Calling Guide: Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

Context Management Guide: Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

import os

from pipecat.services.google.llm import GoogleLLMService
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema

# Configure Gemini service with search grounding
search_tool = {
    "google_search_retrieval": {
        "dynamic_retrieval_config": {
            "mode": "MODE_DYNAMIC",
            "dynamic_threshold": 0.3
        }
    }
}

llm = GoogleLLMService(
    api_key=os.getenv("GOOGLE_API_KEY"),
    model="gemini-2.0-flash",
    system_instruction="""You are a helpful assistant with access to current information.
    When users ask about recent events, use search to provide accurate, up-to-date information.""",
    tools=[search_tool],
    params=GoogleLLMService.InputParams(
        temperature=0.7,
        max_tokens=1000
    )
)

# Define function for tool calling
weather_function = FunctionSchema(
    name="get_weather",
    description="Get current weather information",
    properties={
        "location": {
            "type": "string",
            "description": "City and state, e.g. San Francisco, CA"
        }
    },
    required=["location"]
)

# Define image capture function for multimodal capabilities
image_function = FunctionSchema(
    name="get_image",
    description="Capture and analyze an image from the video stream",
    properties={
        "question": {
            "type": "string",
            "description": "Question about what to analyze in the image"
        }
    },
    required=["question"]
)

tools = ToolsSchema(standard_tools=[weather_function, image_function])

# Create context with multimodal system prompt
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": """You are a helpful assistant with access to current information and vision capabilities.
            You can answer questions about weather, analyze images from video streams, and search for current information.
            Keep responses concise for voice output."""
        },
        {
            "role": "user",
            "content": "Hello! What can you help me with?"
        }
    ],
    tools=tools
)

# Create context aggregators
context_aggregator = llm.create_context_aggregator(context)

# Register function handlers
async def get_weather(params):
    location = params.arguments["location"]
    await params.result_callback(f"Weather in {location}: 72°F and sunny")

async def get_image(params):
    question = params.arguments["question"]
    # Request image from video stream
    await params.llm.request_image_frame(
        user_id=client_id,
        function_name=params.function_name,
        tool_call_id=params.tool_call_id,
        text_content=question
    )
    await params.result_callback(f"Analyzing image for: {question}")

llm.register_function("get_weather", get_weather)
llm.register_function("get_image", get_image)

# Optional: Add function call feedback
@llm.event_handler("on_function_calls_started")
async def on_function_calls_started(service, function_calls):
    await tts.queue_frame(TTSSpeakFrame("Let me check on that."))

# Use in pipeline
pipeline = Pipeline([
    transport.input(),
    stt,
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),
    context_aggregator.assistant()
])
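Note that the get_image handler above references a client_id that is not defined in the excerpt. In a typical Pipecat app it would be captured from a transport event when the user joins. The following is a rough sketch, assuming a Daily transport; the event name and participant fields follow Pipecat's Daily transport conventions and are not part of the original example.

# Sketch: capture the participant id to use as client_id in get_image above.
client_id = None

@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
    global client_id
    client_id = participant["id"]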
Metrics

Google Gemini provides comprehensive usage tracking:
- Time to First Byte (TTFB) - Response latency measurement
- Processing Duration - Total request processing time
- Token Usage - Prompt tokens, completion tokens, and totals

Enable with:

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True
    )
)

Additional Notes
- Multimodal Capabilities: Native support for text, images, audio, and video processing
- Search Grounding: Real-time web search with automatic citation and source attribution
- System Instructions: Handled differently than OpenAI; the system prompt is set during service initialization rather than passed as a system message (see the sketch after this list)
- Vision Functions: Built-in support for image capture and analysis from video streams
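To make the system-instruction note concrete, here is a minimal sketch using only parameters already shown in the usage example above (it assumes the same imports are in scope): the prompt is passed to the GoogleLLMService constructor instead of living solely in the message list.

# Sketch: with Gemini, the system prompt is a constructor argument.
llm = GoogleLLMService(
    api_key=os.getenv("GOOGLE_API_KEY"),
    model="gemini-2.0-flash",
    system_instruction="You are a concise, friendly voice assistant.",
)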
diff --git a/llm_google-vertex_0ec9db9d.txt b/llm_google-vertex_0ec9db9d.txt
new file mode 100644
index 0000000000000000000000000000000000000000..ca4676adfd3c9a490f8cb640a00f878f08519643
--- /dev/null
+++ b/llm_google-vertex_0ec9db9d.txt
@@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/server/services/llm/google-vertex#param-project-id
+Title: Google Vertex AI - Pipecat
+==================================================

Overview

GoogleVertexLLMService provides access to Google's language models through Vertex AI while maintaining an OpenAI-compatible interface. It inherits from OpenAILLMService and supports all the features of the OpenAI interface while connecting to Google's AI services.

Installation

To use GoogleVertexLLMService, install the required dependencies:

pip install "pipecat-ai[google]"

You'll also need to set up Google Cloud credentials.
You can either:
- Set the GOOGLE_APPLICATION_CREDENTIALS environment variable pointing to your service account JSON file
- Provide credentials directly to the service constructor

Configuration

Constructor Parameters
- credentials (Optional[str]): JSON string of Google service account credentials
- credentials_path (Optional[str]): Path to the Google service account JSON file
- model (str, default: "google/gemini-2.0-flash-001"): Model identifier
- params (InputParams): Vertex AI specific parameters

Input Parameters

Extends the OpenAI input parameters with Vertex AI specific options:
- location (str, default: "us-east4"): Google Cloud region where the model is deployed
- project_id (str, required): Google Cloud project ID

Also inherits all OpenAI-compatible parameters:
- frequency_penalty (Optional[float]): Reduces likelihood of repeating tokens based on their frequency. Range: [-2.0, 2.0]
- max_tokens (Optional[int]): Maximum number of tokens to generate. Must be greater than or equal to 1
- presence_penalty (Optional[float]): Reduces likelihood of repeating any tokens that have appeared. Range: [-2.0, 2.0]
- temperature (Optional[float]): Controls randomness in the output. Range: [0.0, 2.0]
- top_p (Optional[float]): Controls diversity via nucleus sampling. Range: [0.0, 1.0]

Usage Example

from pipecat.services.google.llm_vertex import GoogleVertexLLMService
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineParams, PipelineTask

# Configure service
llm = GoogleVertexLLMService(
    credentials_path="/path/to/service-account.json",
    model="google/gemini-2.0-flash-001",
    params=GoogleVertexLLMService.InputParams(
        project_id="your-google-cloud-project-id",
        location="us-east4"
    )
)

# Create context with system message
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant in a voice conversation. Keep responses concise."
        }
    ]
)

# Create context aggregator for message handling
context_aggregator = llm.create_context_aggregator(context)

# Set up pipeline
pipeline = Pipeline([
    transport.input(),
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),
    context_aggregator.assistant()
])

# Create and configure task
task = PipelineTask(
    pipeline,
    params=PipelineParams(
        allow_interruptions=True,
        enable_metrics=True,
        enable_usage_metrics=True,
    ),
)

Authentication

The service supports multiple authentication methods:
- Direct credentials string - Pass the JSON credentials as a string to the constructor
- Credentials file path - Provide a path to the service account JSON file
- Environment variable - Set GOOGLE_APPLICATION_CREDENTIALS to the path of your service account file

The service automatically handles token refresh, with tokens having a 1-hour lifetime.

Methods

See the LLM base class methods for additional functionality.

Function Calling

This service supports function calling (also known as tool calling) through the OpenAI-compatible interface, which allows the LLM to request information from external services and APIs.

Function Calling Guide: Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.
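As a rough illustration of that OpenAI-compatible interface, the FunctionSchema / ToolsSchema / register_function pattern shown for the other LLM services in these docs should carry over to GoogleVertexLLMService as well. The sketch below reuses the hypothetical weather tool from those examples and assumes the llm object from the usage example above; it replaces the plain context shown there with one that carries tools.

from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema

# Hypothetical example tool, mirroring the pattern used on the other LLM service pages
weather_function = FunctionSchema(
    name="get_current_weather",
    description="Get current weather information",
    properties={
        "location": {
            "type": "string",
            "description": "City and state, e.g. San Francisco, CA"
        }
    },
    required=["location"]
)
tools = ToolsSchema(standard_tools=[weather_function])

# Attach the tools when building the context
context = OpenAILLMContext(
    messages=[{"role": "system", "content": "You are a helpful voice assistant."}],
    tools=tools
)
context_aggregator = llm.create_context_aggregator(context)

# Register the handler; result_callback returns the tool result to the LLM
async def fetch_weather(params):
    location = params.arguments["location"]
    await params.result_callback({"conditions": "sunny", "temperature": "75°F"})

llm.register_function("get_current_weather", fetch_weather)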
Available Models

Model Name                   | Description
---------------------------- | --------------------------------------
google/gemini-2.0-flash-001  | Fast, efficient text generation model
google/gemini-2.0-pro-001    | Comprehensive, high-quality model
google/gemini-1.5-pro-001    | Versatile multimodal model
google/gemini-1.5-flash-001  | Fast, efficient multimodal model

See Google Vertex AI documentation for a complete list of supported models and their capabilities.

Frame Flow

Inherits the OpenAI LLM Service frame flow.

Metrics Support

The service collects standard LLM metrics:
- Token usage (prompt and completion)
- Processing duration
- Time to First Byte (TTFB)
- Function call metrics

Notes
- Uses Google Cloud's Vertex AI API
- Maintains OpenAI-compatible interface
- Supports streaming responses
- Handles function calling
- Manages conversation context
- Includes token usage tracking
- Thread-safe processing
- Automatic token refresh
- Requires Google Cloud project setup
diff --git a/llm_grok_a3bc95ee.txt b/llm_grok_a3bc95ee.txt
new file mode 100644
index 0000000000000000000000000000000000000000..953575bdb5f5bb4aa12f9a8911205572ddbdc91c
--- /dev/null
+++ b/llm_grok_a3bc95ee.txt
@@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/server/services/llm/grok#function-calling
+Title: Grok - Pipecat
+==================================================

Overview

GrokLLMService provides access to Grok's language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management.

- API Reference: Complete API documentation and method details
- Grok Docs: Official Grok API documentation and features
- Example Code: Working example with function calling

Installation

To use GrokLLMService, install the required dependencies:

pip install "pipecat-ai[grok]"

You'll also need to set up your Grok API key as an environment variable: GROK_API_KEY. Get your API key from X.AI Console.
Frames

Input

- OpenAILLMContextFrame - Conversation context and history
- LLMMessagesFrame - Direct message list
- VisionImageRawFrame - Images for vision processing
- LLMUpdateSettingsFrame - Runtime parameter updates

Output

- LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
- LLMTextFrame - Streamed completion chunks
- FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
- ErrorFrame - API or processing errors

Function Calling

Function Calling Guide - Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

Context Management Guide - Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

import os

from pipecat.services.grok.llm import GrokLLMService
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema

# Configure Grok service
llm = GrokLLMService(
    api_key=os.getenv("GROK_API_KEY"),
    model="grok-3-beta",
    params=GrokLLMService.InputParams(
        temperature=0.8,  # Higher for creative responses
        max_tokens=1000
    )
)

# Define function for tool calling
weather_function = FunctionSchema(
    name="get_current_weather",
    description="Get current weather information",
    properties={
        "location": {
            "type": "string",
            "description": "City and state, e.g. San Francisco, CA"
        },
        "format": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "Temperature unit to use"
        }
    },
    required=["location", "format"]
)
tools = ToolsSchema(standard_tools=[weather_function])

# Create context optimized for voice interaction
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": """You are a helpful and creative assistant in a voice conversation.
Your output will be converted to audio, so avoid special characters.
Respond in an engaging and helpful way while being succinct."""
        }
    ],
    tools=tools
)

# Create context aggregators
context_aggregator = llm.create_context_aggregator(context)

# Register function handler
async def fetch_weather(params):
    location = params.arguments["location"]
    await params.result_callback({"conditions": "sunny", "temperature": "75°F"})

llm.register_function("get_current_weather", fetch_weather)

# Use in pipeline
pipeline = Pipeline([
    transport.input(),
    stt,
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),
    context_aggregator.assistant()
])

Metrics

Inherits all OpenAI metrics capabilities with specialized token tracking:

- Time to First Byte (TTFB) - Response latency measurement
- Processing Duration - Total request processing time
- Token Usage - Accumulated prompt tokens, completion tokens, and totals

Grok uses incremental token reporting, so metrics are accumulated and reported at the end of each response.
Enable with:

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True
    )
)

Additional Notes

- OpenAI Compatibility: Full compatibility with OpenAI API features and parameters
- Real-time Information: Access to current events and up-to-date information
- Vision Capabilities: Image understanding and analysis with the grok-2-vision model
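A minimal sketch of exercising the vision note above. The model identifier "grok-2-vision" is taken from that note rather than verified against X.AI's current model list; image input would then arrive via the VisionImageRawFrame type listed under Frames.

import os

from pipecat.services.grok.llm import GrokLLMService

# Same service class, pointed at the vision-capable model named in the note above.
vision_llm = GrokLLMService(
    api_key=os.getenv("GROK_API_KEY"),
    model="grok-2-vision",
)

# Image input is delivered to the service as VisionImageRawFrame (see the Frames
# section above); the rest of the pipeline wiring is unchanged.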
\ No newline at end of file
diff --git a/llm_groq_08d80ef2.txt b/llm_groq_08d80ef2.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1704e3668bfc3d12f8eefd4dbca2cc386125a846
--- /dev/null
+++ b/llm_groq_08d80ef2.txt
@@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/server/services/llm/groq#usage-example
+Title: Groq - Pipecat
+==================================================
+
Overview

GroqLLMService provides access to Groq's language models through an OpenAI-compatible interface.
It inherits from OpenAILLMService and supports streaming responses, function calling, and context management.

- API Reference - Complete API documentation and method details
- Groq Docs - Official Groq API documentation and features
- Example Code - Working example with function calling

Installation

To use Groq services, install the required dependency:

pip install "pipecat-ai[groq]"

You'll also need to set your Groq API key as an environment variable: GROQ_API_KEY. Get your API key for free from the Groq Console.

Frames

Input

- OpenAILLMContextFrame - Conversation context and history
- LLMMessagesFrame - Direct message list
- VisionImageRawFrame - Images for vision processing (select models)
- LLMUpdateSettingsFrame - Runtime parameter updates

Output

- LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
- LLMTextFrame - Streamed completion chunks
- FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
- ErrorFrame - API or processing errors

Function Calling

Function Calling Guide - Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

Context Management Guide - Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

import os

from pipecat.services.groq.llm import GroqLLMService
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema

# Configure Groq service for speed
llm = GroqLLMService(
    api_key=os.getenv("GROQ_API_KEY"),
    model="llama-3.3-70b-versatile",  # Fast, capable model
    params=GroqLLMService.InputParams(
        temperature=0.7,
        max_tokens=1000
    )
)

# Define function for tool calling
weather_function = FunctionSchema(
    name="get_current_weather",
    description="Get current weather information",
    properties={
        "location": {
            "type": "string",
            "description": "City and state, e.g. San Francisco, CA"
        },
        "format": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "Temperature unit to use"
        }
    },
    required=["location", "format"]
)
tools = ToolsSchema(standard_tools=[weather_function])

# Create context optimized for voice interaction
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": """You are a helpful assistant optimized for voice conversations.
Keep responses concise and avoid special characters that don't work well in speech."""
        }
    ],
    tools=tools
)

# Create context aggregators with fast timeout for speed
from pipecat.processors.aggregators.llm_response import LLMUserAggregatorParams

context_aggregator = llm.create_context_aggregator(
    context,
    user_params=LLMUserAggregatorParams(aggregation_timeout=0.05)  # Fast aggregation
)

# Register function handler with feedback
async def fetch_weather(params):
    location = params.arguments["location"]
    await params.result_callback({"conditions": "sunny", "temperature": "75°F"})

llm.register_function("get_current_weather", fetch_weather)

# Optional: Add function call feedback for better UX
@llm.event_handler("on_function_calls_started")
async def on_function_calls_started(service, function_calls):
    await tts.queue_frame(TTSSpeakFrame("Let me check on that."))

# Use in pipeline with Groq STT for full Groq stack
pipeline = Pipeline([
    transport.input(),
    groq_stt,  # GroqSTTService for consistent ecosystem
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),
    context_aggregator.assistant()
])
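The pipeline above references a groq_stt element without constructing it. A minimal sketch of that wiring; the import path and constructor parameters are assumptions based on Pipecat's service layout, so check the Speech-to-Text service docs for the exact signature.

import os

# Assumed import path for the Groq speech-to-text service.
from pipecat.services.groq.stt import GroqSTTService

groq_stt = GroqSTTService(
    api_key=os.getenv("GROQ_API_KEY"),  # same key as the LLM service
    model="whisper-large-v3",           # assumed Whisper model hosted on Groq
)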
Metrics

Inherits all OpenAI metrics capabilities:

- Time to First Byte (TTFB) - Ultra-low latency measurements
- Processing Duration - Hardware-accelerated processing times
- Token Usage - Prompt tokens, completion tokens, and totals
- Function Call Metrics - Tool usage and execution tracking

Enable with:

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True
    )
)

Additional Notes

- OpenAI Compatibility: Full compatibility with OpenAI API features and parameters
- Real-time Optimized: Ideal for conversational AI and streaming applications
- Open Source Models: Access to Llama, Mixtral, and other open-source models
- Vision Support: Select models support image understanding capabilities
- Free Tier: Generous free tier available for development and testing
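The Frames section above lists LLMUpdateSettingsFrame for runtime parameter updates. A sketch of lowering the temperature mid-session from inside an async handler, using the task from the metrics example; the settings-mapping constructor is an assumption, so check the frames reference for the exact form.

from pipecat.frames.frames import LLMUpdateSettingsFrame

# Assumed constructor: a mapping of OpenAI-style parameter names to new values.
await task.queue_frame(LLMUpdateSettingsFrame(settings={"temperature": 0.3}))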
\ No newline at end of file
diff --git a/llm_nim_1b768057.txt b/llm_nim_1b768057.txt
new file mode 100644
index 0000000000000000000000000000000000000000..cec00e2d9f8ab34dab2695e6e391bfac905a8f30
--- /dev/null
+++ b/llm_nim_1b768057.txt
@@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/server/services/llm/nim#additional-notes
+Title: NVIDIA NIM - Pipecat
+==================================================
+
Overview

NimLLMService provides access to NVIDIA's NIM language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management, with special handling for NVIDIA's incremental token reporting.

- API Reference - Complete API documentation and method details
- NVIDIA NIM Docs - Official NVIDIA NIM documentation and setup
- Example Code - Working example with function calling

Installation

To use NVIDIA NIM services, install the required dependencies:

pip install "pipecat-ai[nim]"

You'll also need to set your NVIDIA API key as an environment variable: NVIDIA_API_KEY. Get your API key from NVIDIA Build.
Frames

Input

- OpenAILLMContextFrame - Conversation context and history
- LLMMessagesFrame - Direct message list
- VisionImageRawFrame - Images for vision processing
- LLMUpdateSettingsFrame - Runtime parameter updates

Output

- LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
- LLMTextFrame - Streamed completion chunks
- FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
- ErrorFrame - API or processing errors

Function Calling

Function Calling Guide - Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

Context Management Guide - Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

import os

from pipecat.services.nim.llm import NimLLMService
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema

# Configure NVIDIA NIM service
llm = NimLLMService(
    api_key=os.getenv("NVIDIA_API_KEY"),
    model="nvidia/llama-3.1-nemotron-70b-instruct",
    params=NimLLMService.InputParams(
        temperature=0.7,
        max_tokens=1000
    )
)

# Define function for tool calling
weather_function = FunctionSchema(
    name="get_current_weather",
    description="Get current weather information",
    properties={
        "location": {
            "type": "string",
            "description": "City and state, e.g. San Francisco, CA"
        },
        "format": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "Temperature unit to use"
        }
    },
    required=["location", "format"]
)
tools = ToolsSchema(standard_tools=[weather_function])

# Create context optimized for voice
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": """You are a helpful assistant optimized for voice interactions.
Keep responses concise and avoid special characters for better speech synthesis."""
        }
    ],
    tools=tools
)

# Create context aggregators
context_aggregator = llm.create_context_aggregator(context)

# Register function handler with feedback
async def fetch_weather(params):
    location = params.arguments["location"]
    await params.result_callback({"conditions": "sunny", "temperature": "75°F"})

llm.register_function("get_current_weather", fetch_weather)

# Optional: Add function call feedback
@llm.event_handler("on_function_calls_started")
async def on_function_calls_started(service, function_calls):
    await tts.queue_frame(TTSSpeakFrame("Let me check on that."))

# Use in pipeline
pipeline = Pipeline([
    transport.input(),
    stt,
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),
    context_aggregator.assistant()
])

Metrics

Includes specialized token usage tracking for NIM's incremental reporting:

- Time to First Byte (TTFB) - Response latency measurement
- Processing Duration - Total request processing time
- Token Usage - Tracks tokens used per request, compatible with NIM's incremental reporting

Enable with:

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True
    )
)

Additional Notes

- OpenAI Compatibility: Full compatibility with OpenAI API features and parameters
- NVIDIA Optimization: Hardware-accelerated inference on NVIDIA infrastructure
- Token Reporting: Custom handling for NIM's incremental vs. OpenAI's final token reporting
- Model Variety: Access to Nemotron and other NVIDIA-optimized model variants
Function Calling

Function Calling Guide - Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

Context Management Guide - Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

import os

from pipecat.services.nim.llm import NimLLMService
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema

# Configure NVIDIA NIM service
llm = NimLLMService(
    api_key=os.getenv("NVIDIA_API_KEY"),
    model="nvidia/llama-3.1-nemotron-70b-instruct",
    params=NimLLMService.InputParams(
        temperature=0.7,
        max_tokens=1000
    )
)

# Define function for tool calling
weather_function = FunctionSchema(
    name="get_current_weather",
    description="Get current weather information",
    properties={
        "location": {
            "type": "string",
            "description": "City and state, e.g. San Francisco, CA"
        },
        "format": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "Temperature unit to use"
        }
    },
    required=["location", "format"]
)

tools = ToolsSchema(standard_tools=[weather_function])

# Create context optimized for voice
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": """You are a helpful assistant optimized for voice interactions.
Keep responses concise and avoid special characters for better speech synthesis."""
        }
    ],
    tools=tools
)

# Create context aggregators
context_aggregator = llm.create_context_aggregator(context)

# Register function handler with feedback
async def fetch_weather(params):
    location = params.arguments["location"]
    await params.result_callback({"conditions": "sunny", "temperature": "75°F"})

llm.register_function("get_current_weather", fetch_weather)

# Optional: Add function call feedback
@llm.event_handler("on_function_calls_started")
async def on_function_calls_started(service, function_calls):
    await tts.queue_frame(TTSSpeakFrame("Let me check on that."))

# Use in pipeline
pipeline = Pipeline([
    transport.input(),
    stt,
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),
    context_aggregator.assistant()
])

Metrics

Includes specialized token usage tracking for NIM’s incremental reporting:

Time to First Byte (TTFB) - Response latency measurement
Processing Duration - Total request processing time
Token Usage - Tracks tokens used per request, compatible with NIM’s incremental reporting

Enable with:

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True
    )
)

Additional Notes

OpenAI Compatibility: Full compatibility with OpenAI API features and parameters
NVIDIA Optimization: Hardware-accelerated inference on NVIDIA infrastructure
Token Reporting: Custom handling for NIM’s incremental vs. OpenAI’s final token reporting
Model Variety: Access to Nemotron and other NVIDIA-optimized model variants
San Francisco, CA" }, "format" : { "type" : "string" , "enum" : [ "celsius" , "fahrenheit" ], "description" : "Temperature unit to use" } }, required = [ "location" , "format" ] ) tools = ToolsSchema( standard_tools = [weather_function]) # Create context optimized for voice context = OpenAILLMContext( messages = [ { "role" : "system" , "content" : """You are a helpful assistant optimized for voice interactions. Keep responses concise and avoid special characters for better speech synthesis.""" } ], tools = tools ) # Create context aggregators context_aggregator = llm.create_context_aggregator(context) # Register function handler with feedback async def fetch_weather ( params ): location = params.arguments[ "location" ] await params.result_callback({ "conditions" : "sunny" , "temperature" : "75°F" }) llm.register_function( "get_current_weather" , fetch_weather) # Optional: Add function call feedback @llm.event_handler ( "on_function_calls_started" ) async def on_function_calls_started ( service , function_calls ): await tts.queue_frame(TTSSpeakFrame( "Let me check on that." )) # Use in pipeline pipeline = Pipeline([ transport.input(), stt, context_aggregator.user(), llm, tts, transport.output(), context_aggregator.assistant() ]) ​ Metrics Includes specialized token usage tracking for NIM’s incremental reporting: Time to First Byte (TTFB) - Response latency measurement Processing Duration - Total request processing time Token Usage - Tracks tokens used per request, compatible with NIM’s incremental reporting Enable with: Copy Ask AI task = PipelineTask( pipeline, params = PipelineParams( enable_metrics = True , enable_usage_metrics = True ) ) ​ Additional Notes OpenAI Compatibility : Full compatibility with OpenAI API features and parameters NVIDIA Optimization : Hardware-accelerated inference on NVIDIA infrastructure Token Reporting : Custom handling for NIM’s incremental vs. OpenAI’s final token reporting Model Variety : Access to Nemotron and other NVIDIA-optimized model variants Groq Ollama On this page Overview Installation Frames Input Output Function Calling Context Management Usage Example Metrics Additional Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/llm_ollama_5178a1e5.txt b/llm_ollama_5178a1e5.txt new file mode 100644 index 0000000000000000000000000000000000000000..60a45a4a558df827b6c74e0c219e938b185c8dbf --- /dev/null +++ b/llm_ollama_5178a1e5.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/services/llm/ollama#context-management +Title: Ollama - Pipecat +================================================== + +Ollama - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... 
Overview

OLLamaLLMService provides access to locally-run Ollama models through an OpenAI-compatible interface. It inherits from BaseOpenAILLMService and allows you to run various open-source models locally while maintaining compatibility with OpenAI’s API format.

API Reference - Complete API documentation and method details
Ollama Docs - Official Ollama documentation and model library
Download Ollama - Download and setup instructions for Ollama

Installation

To use Ollama services, you need to install both Ollama and the Pipecat dependency:

1. Install Ollama on your system from ollama.com/download
2. Install the Pipecat dependency:

pip install "pipecat-ai[ollama]"

3. Pull a model (first time only):

ollama pull llama2

Ollama runs as a local service on port 11434. No API key is required.
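A quick way to confirm the local server is reachable and that a model has been pulled is to query Ollama’s REST API directly (a minimal sketch; the /api/tags endpoint and its response fields belong to Ollama’s own API, not Pipecat, and are an assumption here):

import json
import urllib.request

# List models that have been pulled into the local Ollama instance.
with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    models = json.load(resp).get("models", [])

print("Available local models:", [m.get("name") for m in models])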
San Francisco, CA" }, "format" : { "type" : "string" , "enum" : [ "celsius" , "fahrenheit" ], "description" : "Temperature unit to use" } }, required = [ "location" , "format" ] ) tools = ToolsSchema( standard_tools = [weather_function]) # Create context optimized for local model context = OpenAILLMContext( messages = [ { "role" : "system" , "content" : """You are a helpful assistant running locally. Be concise and efficient in your responses while maintaining helpfulness.""" } ], tools = tools ) # Create context aggregators context_aggregator = llm.create_context_aggregator(context) # Register function handler - all processing stays local async def fetch_weather ( params ): location = params.arguments[ "location" ] # Local weather lookup or cached data await params.result_callback({ "conditions" : "sunny" , "temperature" : "22°C" }) llm.register_function( "get_current_weather" , fetch_weather) # Use in pipeline - completely offline capable pipeline = Pipeline([ transport.input(), stt, # Can use local STT too context_aggregator.user(), llm, # All inference happens locally tts, # Can use local TTS too transport.output(), context_aggregator.assistant() ]) ​ Metrics Inherits all OpenAI metrics capabilities for local monitoring: Time to First Byte (TTFB) - Local inference latency Processing Duration - Model execution time Token Usage - Local token counting (if supported by model) Enable with: Copy Ask AI task = PipelineTask( pipeline, params = PipelineParams( enable_metrics = True , enable_usage_metrics = True ) ) ​ Additional Notes Run models locally : Ollama allows you to run various open-source models on your own hardware, providing flexibility and control. OpenAI Compatibility : Full compatibility with OpenAI API features and parameters Privacy centric : All processing happens locally, ensuring data privacy and security. NVIDIA NIM OpenAI On this page Overview Installation Frames Input Output Function Calling Context Management Usage Example Metrics Additional Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/llm_ollama_64d3ee76.txt b/llm_ollama_64d3ee76.txt new file mode 100644 index 0000000000000000000000000000000000000000..1a13f67d5cef9b8a513b46151eadc8c3e6677a58 --- /dev/null +++ b/llm_ollama_64d3ee76.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/services/llm/ollama#function-calling +Title: Ollama - Pipecat +================================================== + +Ollama - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation LLM Ollama Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Anthropic AWS Bedrock Azure Cerebras DeepSeek Fireworks AI Google Gemini Google Vertex AI Grok Groq NVIDIA NIM Ollama OpenAI OpenPipe OpenRouter Perplexity Qwen SambaNova Together AI Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview OLLamaLLMService provides access to locally-run Ollama models through an OpenAI-compatible interface. 
URL: https://docs.pipecat.ai/server/services/llm/openai
Title: OpenAI - Pipecat
==================================================

Overview

OpenAILLMService provides chat completion capabilities using OpenAI’s API, supporting streaming responses, function calling, vision input, and advanced context management for conversational AI applications.
API Reference - Complete API documentation and method details
OpenAI Docs - Official OpenAI API documentation
Example Code - Function calling example with weather API

Installation

To use OpenAI services, install the required dependencies:

pip install "pipecat-ai[openai]"

You’ll also need to set up your OpenAI API key as an environment variable: OPENAI_API_KEY. Get your API key from the OpenAI Platform.

Frames

Input

OpenAILLMContextFrame - OpenAI-specific conversation context
LLMMessagesFrame - Standard conversation messages
VisionImageRawFrame - Images for vision model processing
LLMUpdateSettingsFrame - Runtime model configuration updates

Output

LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
LLMTextFrame - Streamed completion chunks
FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
ErrorFrame - API or processing errors

Function Calling

Function Calling Guide - Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

Context Management Guide - Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

Basic Conversation with Function Calling

import os

from pipecat.services.openai.llm import OpenAILLMService
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.services.llm_service import FunctionCallParams

# Configure the service
llm = OpenAILLMService(
    model="gpt-4o",
    api_key=os.getenv("OPENAI_API_KEY"),
    params=OpenAILLMService.InputParams(
        temperature=0.7,
    )
)

# Define function schema
weather_function = FunctionSchema(
    name="get_weather",
    description="Get current weather information",
    properties={
        "location": {
            "type": "string",
            "description": "City name"
        }
    },
    required=["location"]
)

# Create tools and context
tools = ToolsSchema(standard_tools=[weather_function])
context = OpenAILLMContext(
    messages=[{"role": "system", "content": "You are a helpful assistant. Keep responses concise."}],
    tools=tools
)

# Register function handler
async def get_weather_handler(params: FunctionCallParams):
    location = params.arguments.get("location")
    # Call weather API here...
    weather_data = {"temperature": "75°F", "conditions": "sunny"}
    await params.result_callback(weather_data)

llm.register_function("get_weather", get_weather_handler)

# Create context aggregators
context_aggregator = llm.create_context_aggregator(context)

# Use in pipeline
pipeline = Pipeline([
    transport.input(),
    stt,
    context_aggregator.user(),      # Handles user messages
    llm,                            # Processes with OpenAI
    tts,
    transport.output(),
    context_aggregator.assistant()  # Captures responses
])
weather_data = { "temperature" : "75°F" , "conditions" : "sunny" } await params.result_callback(weather_data) llm.register_function( "get_weather" , get_weather_handler) # Create context aggregators context_aggregator = llm.create_context_aggregator(context) # Use in pipeline pipeline = Pipeline([ transport.input(), stt, context_aggregator.user(), # Handles user messages llm, # Processes with OpenAI tts, transport.output(), context_aggregator.assistant() # Captures responses ]) ​ Metrics The service provides: Time to First Byte (TTFB) - Latency from request to first response token Processing Duration - Total request processing time Token Usage - Prompt tokens, completion tokens, and total usage Enable with: Copy Ask AI task = PipelineTask( pipeline, params = PipelineParams( enable_metrics = True , enable_usage_metrics = True ) ) ​ Additional Notes Streaming Responses : All responses are streamed for low latency Context Persistence : Use context aggregators to maintain conversation history Error Handling : Automatic retry logic for rate limits and transient errors Compatible Services : Works with OpenAI-compatible APIs by setting base_url Ollama OpenPipe On this page Overview Installation Frames Input Output Function Calling Context Management Usage Example Basic Conversation with Function Calling Metrics Additional Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/llm_openai_2c6f2215.txt b/llm_openai_2c6f2215.txt new file mode 100644 index 0000000000000000000000000000000000000000..7acdd660084f29531938935c589a1874e4e4a095 --- /dev/null +++ b/llm_openai_2c6f2215.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/services/llm/openai#context-management +Title: OpenAI - Pipecat +================================================== + +OpenAI - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation LLM OpenAI Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Anthropic AWS Bedrock Azure Cerebras DeepSeek Fireworks AI Google Gemini Google Vertex AI Grok Groq NVIDIA NIM Ollama OpenAI OpenPipe OpenRouter Perplexity Qwen SambaNova Together AI Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview OpenAILLMService provides chat completion capabilities using OpenAI’s API, supporting streaming responses, function calling, vision input, and advanced context management for conversational AI applications. API Reference Complete API documentation and method details OpenAI Docs Official OpenAI API documentation Example Code Function calling example with weather API ​ Installation To use OpenAI services, install the required dependencies: Copy Ask AI pip install "pipecat-ai[openai]" You’ll also need to set up your OpenAI API key as an environment variable: OPENAI_API_KEY . Get your API key from the OpenAI Platform . 
URL: https://docs.pipecat.ai/server/services/llm/openpipe
Title: OpenPipe - Pipecat
==================================================

Overview

OpenPipeLLMService extends the BaseOpenAILLMService to provide integration with OpenPipe, enabling request logging, model fine-tuning, and performance monitoring. It maintains compatibility with OpenAI’s API while adding OpenPipe’s logging and optimization capabilities.
URL: https://docs.pipecat.ai/server/services/llm/openpipe
Title: OpenPipe - Pipecat
==================================================

Overview

OpenPipeLLMService extends the BaseOpenAILLMService to provide integration with OpenPipe, enabling request logging, model fine-tuning, and performance monitoring. It maintains compatibility with OpenAI's API while adding OpenPipe's logging and optimization capabilities.

- API Reference - Complete API documentation and method details
- OpenPipe Docs - Official OpenPipe API documentation and features

Installation

To use OpenPipeLLMService, install the required dependencies:

    pip install "pipecat-ai[openpipe]"

You'll need to set both API keys as environment variables:

- OPENPIPE_API_KEY - Your OpenPipe API key
- OPENAI_API_KEY - Your OpenAI API key

Get your OpenPipe API key from the OpenPipe Dashboard.

Frames

Input

- OpenAILLMContextFrame - Conversation context and history
- LLMMessagesFrame - Direct message list
- VisionImageRawFrame - Images for vision processing
- LLMUpdateSettingsFrame - Runtime parameter updates

Output

- LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
- LLMTextFrame - Streamed completion chunks
- FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
- ErrorFrame - API or processing errors

Function Calling

See the Function Calling Guide to learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

See the Context Management Guide to learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

    import os

    from pipecat.services.openpipe.llm import OpenPipeLLMService
    from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
    from pipecat.adapters.schemas.function_schema import FunctionSchema
    from pipecat.adapters.schemas.tools_schema import ToolsSchema

    # Configure OpenPipe service with comprehensive logging
    llm = OpenPipeLLMService(
        model="gpt-4o",
        api_key=os.getenv("OPENAI_API_KEY"),
        openpipe_api_key=os.getenv("OPENPIPE_API_KEY"),
        tags={
            "environment": "production",
            "feature": "conversational-ai",
            "deployment": "voice-assistant",
            "version": "v1.2",
        },
        params=OpenPipeLLMService.InputParams(
            temperature=0.7,
            max_completion_tokens=1000,
        ),
    )

    # Define function for monitoring tool usage
    weather_function = FunctionSchema(
        name="get_weather",
        description="Get current weather information",
        properties={
            "location": {
                "type": "string",
                "description": "City and state, e.g. San Francisco, CA",
            }
        },
        required=["location"],
    )
    tools = ToolsSchema(standard_tools=[weather_function])

    # Create context with a speech-friendly system prompt
    context = OpenAILLMContext(
        messages=[
            {
                "role": "system",
                "content": """You are a helpful voice assistant.
    Keep responses concise and natural for speech synthesis.
    All conversations are logged for quality improvement.""",
            }
        ],
        tools=tools,
    )

    # Create context aggregators
    context_aggregator = llm.create_context_aggregator(context)

    # Register function with logging awareness
    async def get_weather(params):
        location = params.arguments["location"]
        # Function calls are automatically logged by OpenPipe
        await params.result_callback(f"Weather in {location}: 72°F and sunny")

    llm.register_function("get_weather", get_weather)

    # Use in pipeline - all requests automatically logged
    pipeline = Pipeline([
        transport.input(),
        stt,
        context_aggregator.user(),
        llm,  # Automatic logging happens here
        tts,
        transport.output(),
        context_aggregator.assistant(),
    ])

Metrics

Inherits all OpenAI metrics plus OpenPipe-specific logging:

- Time to First Byte (TTFB) - Response latency measurement
- Processing Duration - Total request processing time
- Token Usage - Detailed consumption tracking

Enable with:

    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
            enable_usage_metrics=True,
        ),
    )

Additional Notes

- OpenAI Compatibility: Full compatibility with OpenAI API features and parameters
- Privacy Aware: Configurable data retention and filtering policies
- Cost Optimization: Detailed analytics help optimize model usage and costs
- Fine-tuning Pipeline: Seamless transition from logging to custom model training
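Because every request is logged, the tags passed above are the main handle for slicing those logs later. A small sketch of one possible pattern, deriving tags from the deployment environment; the environment variable names here are assumptions of your own setup, not part of the OpenPipe integration:

    import os

    from pipecat.services.openpipe.llm import OpenPipeLLMService

    # Sketch: tag requests by environment and release so OpenPipe logs can be
    # filtered per deployment. APP_ENV and GIT_SHA are illustrative env vars.
    llm = OpenPipeLLMService(
        model="gpt-4o",
        api_key=os.getenv("OPENAI_API_KEY"),
        openpipe_api_key=os.getenv("OPENPIPE_API_KEY"),
        tags={
            "environment": os.getenv("APP_ENV", "development"),
            "release": os.getenv("GIT_SHA", "local"),
            "feature": "conversational-ai",
        },
    )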
URL: https://docs.pipecat.ai/server/services/llm/openrouter
Title: OpenRouter - Pipecat
==================================================

Overview

OpenRouterLLMService provides access to OpenRouter's language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management.
- API Reference - Complete API documentation and method details
- OpenRouter Docs - Official OpenRouter API documentation and features
- Example Code - Working example with function calling

Installation

To use OpenRouterLLMService, install the required dependencies:

    pip install "pipecat-ai[openrouter]"

You'll also need to set your OpenRouter API key as the OPENROUTER_API_KEY environment variable. Get your API key from OpenRouter; the free tier includes $1 of credits.

Frames

Input

- OpenAILLMContextFrame - Conversation context and history
- LLMMessagesFrame - Direct message list
- VisionImageRawFrame - Images for vision processing (select models)
- LLMUpdateSettingsFrame - Runtime parameter updates

Output

- LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
- LLMTextFrame - Streamed completion chunks
- FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
- ErrorFrame - API or processing errors

Function Calling

See the Function Calling Guide to learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

See the Context Management Guide to learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

    import os

    from pipecat.services.openrouter.llm import OpenRouterLLMService
    from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
    from pipecat.adapters.schemas.function_schema import FunctionSchema
    from pipecat.adapters.schemas.tools_schema import ToolsSchema
    from pipecat.frames.frames import TTSSpeakFrame  # needed for the spoken feedback below

    # Configure OpenRouter service
    llm = OpenRouterLLMService(
        api_key=os.getenv("OPENROUTER_API_KEY"),
        model="openai/gpt-4o-2024-11-20",  # Easy model switching
        params=OpenRouterLLMService.InputParams(
            temperature=0.7,
            max_tokens=1000,
        ),
    )

    # Define function for tool calling
    weather_function = FunctionSchema(
        name="get_current_weather",
        description="Get current weather information",
        properties={
            "location": {
                "type": "string",
                "description": "City and state, e.g. San Francisco, CA",
            },
            "format": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Temperature unit to use",
            },
        },
        required=["location", "format"],
    )
    tools = ToolsSchema(standard_tools=[weather_function])

    # Create context
    context = OpenAILLMContext(
        messages=[
            {
                "role": "system",
                "content": """You are a helpful assistant optimized for voice conversations.
    Keep responses concise and avoid special characters for better speech synthesis.""",
            }
        ],
        tools=tools,
    )

    # Create context aggregators
    context_aggregator = llm.create_context_aggregator(context)

    # Register function handler with feedback
    async def fetch_weather(params):
        location = params.arguments["location"]
        await params.result_callback({"conditions": "sunny", "temperature": "75°F"})

    llm.register_function("get_current_weather", fetch_weather)

    # Optional: Add function call feedback
    @llm.event_handler("on_function_calls_started")
    async def on_function_calls_started(service, function_calls):
        await tts.queue_frame(TTSSpeakFrame("Let me check on that."))

    # Use in pipeline
    pipeline = Pipeline([
        transport.input(),
        stt,
        context_aggregator.user(),
        llm,
        tts,
        transport.output(),
        context_aggregator.assistant(),
    ])

    # Easy model switching for different use cases
    # llm.set_model_name("anthropic/claude-3.5-sonnet")         # Switch to Claude
    # llm.set_model_name("meta-llama/llama-3.1-70b-instruct")   # Switch to Llama

Metrics

Inherits all OpenAI metrics capabilities:

- Time to First Byte (TTFB) - Response latency measurement
- Processing Duration - Total request processing time
- Token Usage - Prompt tokens, completion tokens, and totals

Enable with:

    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
            enable_usage_metrics=True,
        ),
    )

Additional Notes

- Model Variety: Access 70+ models from OpenAI, Anthropic, Meta, Google, and more
- OpenAI Compatibility: Full compatibility with existing OpenAI code
- Easy Switching: Change models with a single parameter update
- Fallback Support: Built-in model fallbacks for high availability
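The set_model_name() comments at the end of the usage example hint at runtime model switching. A minimal sketch, reusing the model IDs from those comments and assuming a hypothetical needs_reasoning flag of your own:

    # Sketch: pick an OpenRouter model per request based on task complexity.
    # "needs_reasoning" is an illustrative flag, not part of the Pipecat API;
    # the model IDs come from the comments in the example above.
    def choose_model(llm, needs_reasoning: bool) -> None:
        if needs_reasoning:
            llm.set_model_name("anthropic/claude-3.5-sonnet")
        else:
            llm.set_model_name("meta-llama/llama-3.1-70b-instruct")

    choose_model(llm, needs_reasoning=True)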
URL: https://docs.pipecat.ai/server/services/llm/perplexity
Title: Perplexity - Pipecat
==================================================

Overview

PerplexityLLMService provides access to Perplexity's language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses and context management, with special handling for Perplexity's incremental token reporting.

- API Reference - Complete API documentation and method details
- Perplexity Docs - Official Perplexity API documentation and features
- Example Code - Working example with search capabilities

Unlike other LLM services, Perplexity does not support function calling. Instead, it offers built-in internet search without requiring special function calls.

Installation

To use PerplexityLLMService, install the required dependencies:

    pip install "pipecat-ai[perplexity]"

You'll also need to set your Perplexity API key as the PERPLEXITY_API_KEY environment variable. Get your API key from the Perplexity API.

Frames

Input

- OpenAILLMContextFrame - Conversation context and history
- LLMMessagesFrame - Direct message list
- LLMUpdateSettingsFrame - Runtime parameter updates

Output

- LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
- LLMTextFrame - Streamed completion chunks with citations
- ErrorFrame - API or processing errors

Context Management

See the Context Management Guide to learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

    import os

    from pipecat.services.perplexity.llm import PerplexityLLMService
    from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext

    # Configure Perplexity service
    llm = PerplexityLLMService(
        api_key=os.getenv("PERPLEXITY_API_KEY"),
        model="sonar-pro",  # Pro model for enhanced capabilities
        params=PerplexityLLMService.InputParams(
            temperature=0.7,
            max_tokens=1000,
        ),
    )

    # Create context optimized for search and current information
    context = OpenAILLMContext(
        messages=[
            {
                "role": "system",
                "content": """You are a knowledgeable assistant with access to real-time information.
    When answering questions, use your search capabilities to provide current, accurate information.
    Always cite your sources when possible. Keep responses concise for voice output.""",
            }
        ]
    )

    # Create context aggregators
    context_aggregator = llm.create_context_aggregator(context)

    # Use in pipeline for information-rich conversations
    pipeline = Pipeline([
        transport.input(),
        stt,
        context_aggregator.user(),
        llm,  # Will automatically search and cite sources
        tts,
        transport.output(),
        context_aggregator.assistant(),
    ])

    # Enable metrics with special TTFB reporting for Perplexity
    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
            enable_usage_metrics=True,
            report_only_initial_ttfb=True,  # Optimized for Perplexity's response pattern
        ),
    )

Metrics

The service provides specialized token tracking for Perplexity's incremental reporting:

- Time to First Byte (TTFB) - Response latency measurement
- Processing Duration - Total request processing time
- Token Usage - Accumulated prompt and completion tokens

Enable with:

    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
            enable_usage_metrics=True,
        ),
    )

Additional Notes

- No Function Calling: Perplexity doesn't support traditional function calling but provides built-in search instead
- Real-time Data: Access to current information without complex function orchestration
- Source Citations: Automatic citation of web sources in responses
- OpenAI Compatible: Uses the familiar OpenAI-style interface and parameters
URL: https://docs.pipecat.ai/server/services/llm/qwen
Title: Qwen - Pipecat
==================================================

Overview

QwenLLMService provides access to Alibaba Cloud's Qwen language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management, with particularly strong capabilities for Chinese language processing.
API Reference - Complete API documentation and method details
Qwen Docs - Official Qwen API documentation and features
Example Code - Working example with function calling

Installation

To use QwenLLMService, install the required dependencies:

pip install "pipecat-ai[qwen]"

You'll also need to set your DashScope API key as an environment variable: QWEN_API_KEY. Get your API key from Alibaba Cloud Model Studio.

Frames

Input
- OpenAILLMContextFrame - Conversation context and history
- LLMMessagesFrame - Direct message list
- VisionImageRawFrame - Images for vision processing
- LLMUpdateSettingsFrame - Runtime parameter updates

Output
- LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
- LLMTextFrame - Streamed completion chunks
- FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
- ErrorFrame - API or processing errors

Function Calling

Function Calling Guide - Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

Context Management Guide - Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

import os

from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.frames.frames import TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.qwen.llm import QwenLLMService

# Configure Qwen service
llm = QwenLLMService(
    api_key=os.getenv("QWEN_API_KEY"),
    model="qwen2.5-72b-instruct",  # High-quality open source model
    params=QwenLLMService.InputParams(
        temperature=0.7,
        max_tokens=1000
    )
)

# Define function for tool calling
weather_function = FunctionSchema(
    name="get_current_weather",
    description="Get current weather information",
    properties={
        "location": {
            "type": "string",
            "description": "City and country, e.g. Beijing, China or San Francisco, USA"
        },
        "format": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "Temperature unit to use"
        }
    },
    required=["location", "format"]
)

tools = ToolsSchema(standard_tools=[weather_function])

# Create bilingual context for Chinese/English support
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": """You are a helpful assistant in voice conversations. Keep responses concise for speech output.
You can respond in Chinese when the user speaks Chinese, or English when they speak English.
你是一个语音对话助手。请保持简洁的回答以适合语音输出。
当用户用中文交流时用中文回答，用英文交流时用英文回答。"""
        }
    ],
    tools=tools
)

# Create context aggregators
context_aggregator = llm.create_context_aggregator(context)

# Register function handler
async def fetch_weather(params):
    location = params.arguments["location"]
    # Return a response that works well in both languages
    await params.result_callback({
        "conditions": "sunny",
        "temperature": "22°C",
        "location": location
    })

llm.register_function("get_current_weather", fetch_weather)

# Optional: Add function call feedback
@llm.event_handler("on_function_calls_started")
async def on_function_calls_started(service, function_calls):
    await tts.queue_frame(TTSSpeakFrame("Let me check on that."))
# Use in pipeline
pipeline = Pipeline([
    transport.input(),
    stt,
    context_aggregator.user(),
    llm,
    tts,  # Consider QwenTTSService for Chinese speech
    transport.output(),
    context_aggregator.assistant()
])

Metrics

Inherits all OpenAI metrics capabilities:

- Time to First Byte (TTFB) - Response latency measurement
- Processing Duration - Total request processing time
- Token Usage - Prompt tokens, completion tokens, and totals

Enable with:

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True
    )
)

Additional Notes

- OpenAI Compatibility: Full compatibility with OpenAI API features and parameters
- Long Context Support: Models support up to 1M-token contexts for extensive conversations
- Multilingual Strength: Strong performance in Chinese alongside solid English capabilities
- Code-Switching: Handles mixed Chinese-English conversations
- Alibaba Cloud Integration: Native integration with the Alibaba Cloud ecosystem
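The LLMUpdateSettingsFrame listed under Input above lets you adjust the service at runtime without rebuilding the pipeline. Below is a hedged sketch: it assumes the frame accepts a settings mapping (as in recent Pipecat releases) and reuses the task from the usage example to lower the sampling temperature mid-session.

from pipecat.frames.frames import LLMUpdateSettingsFrame


async def make_responses_more_deterministic():
    # Assumption: LLMUpdateSettingsFrame takes a `settings` dict in this Pipecat version.
    # The keys mirror the InputParams fields shown in the usage example above.
    await task.queue_frame(LLMUpdateSettingsFrame(settings={"temperature": 0.3}))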
URL: https://docs.pipecat.ai/server/services/llm/sambanova
Title: SambaNova - Pipecat
==================================================

Overview

SambaNovaLLMService provides access to SambaNova's language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management.

Installation

To use SambaNovaLLMService, install the required dependencies:

pip install "pipecat-ai[sambanova]"

You also need to set your SambaNova API key as an environment variable: SAMBANOVA_API_KEY. Get your SambaNova API key here.

Configuration

Constructor Parameters
- api_key (str, required) - Your SambaNova API key
- model (str, default: "Llama-4-Maverick-17B-128E-Instruct") - Model identifier
- base_url (str, default: "https://api.sambanova.ai/v1") - SambaNova API endpoint

Input Parameters

Inherits OpenAI-compatible parameters:
- max_tokens (Optional[int]) - Maximum number of tokens to generate. Must be greater than or equal to 1.
- temperature (Optional[float]) - Controls randomness in the output. Range: [0.0, 1.0].
- top_p (Optional[float]) - Controls diversity via nucleus sampling. Range: [0.0, 1.0].
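To show how the constructor and input parameters above fit together, here is a minimal, hedged construction sketch. The parameter values are illustrative, and top_p is included only to demonstrate the field; the full usage example follows below.

import os

from pipecat.services.sambanova.llm import SambaNovaLLMService

llm = SambaNovaLLMService(
    api_key=os.getenv("SAMBANOVA_API_KEY"),       # read from the environment rather than hard-coding
    model="Llama-4-Maverick-17B-128E-Instruct",   # the documented default
    params=SambaNovaLLMService.InputParams(
        temperature=0.7,   # [0.0, 1.0]
        top_p=0.9,         # nucleus sampling, [0.0, 1.0]
        max_tokens=1024,   # must be >= 1
    ),
)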
Usage Example

from typing import Any

from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.sambanova.llm import SambaNovaLLMService

# Configure service
llm = SambaNovaLLMService(
    api_key='your-sambanova-api-key',
    model='Llama-4-Maverick-17B-128E-Instruct',
    params=SambaNovaLLMService.InputParams(temperature=0.7, max_tokens=1024),
)

# Define function to call
async def fetch_weather(params: FunctionCallParams) -> Any:
    """Mock function that fetches the weather forecast from an API."""
    await params.result_callback({'conditions': 'nice', 'temperature': '20 Degrees Celsius'})

# Register function handlers
llm.register_function('get_current_weather', fetch_weather)

# Define weather function using standardized schema
weather_function = FunctionSchema(
    name='get_current_weather',
    description='Get the current weather',
    properties={
        'location': {
            'type': 'string',
            'description': 'The city and state.',
        },
        'format': {
            'type': 'string',
            'enum': ['celsius', 'fahrenheit'],
            'description': "The temperature unit to use. Infer this from the user's location.",
        },
    },
    required=['location', 'format'],
)

# Create tools schema
tools = ToolsSchema(standard_tools=[weather_function])

# Define system message
messages = [
    {
        'role': 'system',
        'content': 'You are a helpful LLM in a WebRTC call. '
        'Your goal is to demonstrate your capabilities of weather forecasting in a succinct way. '
        'Introduce yourself to the user and then wait for their question. '
        'Elaborate your response into a conversational answer in a creative and helpful way. '
        'Your output will be converted to audio so do not include special characters in your answer. '
        'Once the final answer has been provided, please stop, unless the user asks another question.',
    },
]

# Create context with system message and tools
context = OpenAILLMContext(messages, tools)

# Create context aggregator for message handling
context_aggregator = llm.create_context_aggregator(context)

# Set up pipeline
pipeline = Pipeline(
    [
        transport.input(),
        stt,
        context_aggregator.user(),
        llm,
        tts,
        transport.output(),
        context_aggregator.assistant(),
    ]
)

# Create and configure task
task = PipelineTask(
    pipeline,
    params=PipelineParams(
        allow_interruptions=True,
        enable_metrics=True,
        enable_usage_metrics=True,
    ),
)

Methods

See the LLM base class methods for additional functionality.

Function Calling

This service supports function calling (also known as tool calling), which allows the LLM to request information from external services and APIs. For example, you can enable your bot to:

- Check current weather conditions
- Query databases
- Access external APIs
- Perform custom actions

Function Calling Guide - Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.
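As a further illustration of the "custom actions" bullet above, the sketch below registers a second, purely hypothetical tool alongside the weather function from the usage example. The get_server_time name, schema, and handler are invented for this example and are not part of Pipecat or SambaNova; the snippet reuses llm, weather_function, and messages defined above.

import datetime

from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.llm_service import FunctionCallParams

# Hypothetical tool: report the current UTC time so the bot can answer "what time is it?"
time_function = FunctionSchema(
    name='get_server_time',
    description='Get the current time in UTC',
    properties={},   # no arguments needed
    required=[],
)

async def fetch_server_time(params: FunctionCallParams) -> None:
    now = datetime.datetime.now(datetime.timezone.utc)
    await params.result_callback({'utc_time': now.strftime('%H:%M')})

# Register the handler and expose both tools to the model.
llm.register_function('get_server_time', fetch_server_time)
tools = ToolsSchema(standard_tools=[weather_function, time_function])
context = OpenAILLMContext(messages, tools)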
Available Models

Model Name                           Model ID
DeepSeek-R1                          deepseek-ai/DeepSeek-R1
DeepSeek-R1-Distill-Llama-70B        deepseek-ai/DeepSeek-R1-Distill-Llama-70B
DeepSeek-V3-0324                     deepseek-ai/DeepSeek-V3-0324
Llama-4-Maverick-17B-128E-Instruct   meta-llama/Llama-4-Maverick-17B-128E-Instruct
Llama-4-Scout-17B-16E-Instruct       meta-llama/Llama-4-Scout-17B-16E-Instruct
Meta-Llama-3.3-70B-Instruct          meta-llama/Llama-3.3-70B-Instruct
Meta-Llama-3.2-3B-Instruct           meta-llama/Llama-3.2-3B-Instruct
Meta-Llama-3.2-1B-Instruct           meta-llama/Llama-3.2-1B-Instruct
Meta-Llama-3.1-405B-Instruct         meta-llama/Llama-3.1-405B-Instruct
Meta-Llama-3.1-8B-Instruct           meta-llama/Llama-3.1-8B-Instruct
Meta-Llama-Guard-3-8B                meta-llama/Llama-Guard-3-8B
QwQ-32B                              Qwen/QwQ-32B
Qwen3-32B                            Qwen/Qwen3-32B
Llama-3.3-Swallow-70B-Instruct-v0.4  Tokyotech-llm/Llama-3.3-Swallow-70B-Instruct-v0.4

See SambaNova's docs for a complete list of supported models.

Frame Flow

Inherits the OpenAI LLM Service frame flow.

Metrics Support

The service collects standard LLM metrics:
- Token usage (prompt and completion)
- Processing duration
- Time to First Byte (TTFB)
- Function call metrics

Notes

- OpenAI-compatible interface
- Supports streaming responses
- Handles function calling
- Manages conversation context
- Includes token usage tracking
- Thread-safe processing
- Automatic error handling
​ Function Calling This service supports function calling (also known as tool calling), which allows the LLM to request information from external services and APIs. For example, you can enable your bot to check current weather conditions, query databases, access external APIs, or perform custom actions (a hypothetical second handler is sketched at the end of this page). Function Calling Guide Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.
​ Available Models
Model Name | Model ID
DeepSeek-R1 | deepseek-ai/DeepSeek-R1
DeepSeek-R1-Distill-Llama-70B | deepseek-ai/DeepSeek-R1-Distill-Llama-70B
DeepSeek-V3-0324 | deepseek-ai/DeepSeek-V3-0324
Llama-4-Maverick-17B-128E-Instruct | meta-llama/Llama-4-Maverick-17B-128E-Instruct
Llama-4-Scout-17B-16E-Instruct | meta-llama/Llama-4-Scout-17B-16E-Instruct
Meta-Llama-3.3-70B-Instruct | meta-llama/Llama-3.3-70B-Instruct
Meta-Llama-3.2-3B-Instruct | meta-llama/Llama-3.2-3B-Instruct
Meta-Llama-3.2-1B-Instruct | meta-llama/Llama-3.2-1B-Instruct
Meta-Llama-3.1-405B-Instruct | meta-llama/Llama-3.1-405B-Instruct
Meta-Llama-3.1-8B-Instruct | meta-llama/Llama-3.1-8B-Instruct
Meta-Llama-Guard-3-8B | meta-llama/Llama-Guard-3-8B
QwQ-32B | Qwen/QwQ-32B
Qwen3-32B | Qwen/Qwen3-32B
Llama-3.3-Swallow-70B-Instruct-v0.4 | Tokyotech-llm/Llama-3.3-Swallow-70B-Instruct-v0.4
See SambaNova’s docs for a complete list of supported models.
​ Frame Flow Inherits the OpenAI LLM Service frame flow.
​ Metrics Support The service collects standard LLM metrics: token usage (prompt and completion), processing duration, Time to First Byte (TTFB), and function call metrics.
​ Notes OpenAI-compatible interface. Supports streaming responses. Handles function calling. Manages conversation context. Includes token usage tracking. Thread-safe processing. Automatic error handling.
\ No newline at end of file
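As a companion to the Function Calling section above, the sketch below registers a second, purely hypothetical handler alongside the weather example. The "lookup_order" name, its schema, and the in-memory "database" are invented for illustration; only FunctionSchema, ToolsSchema, register_function, and FunctionCallParams come from the usage example, and llm and weather_function refer to the objects created there.

from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.services.llm_service import FunctionCallParams

# Hypothetical stand-in for a real database or external API.
FAKE_ORDERS = {"1001": {"status": "shipped", "eta": "2 days"}}

async def lookup_order(params: FunctionCallParams):
    """Hypothetical handler: reads the tool-call arguments and returns a result."""
    order_id = params.arguments.get("order_id", "")
    result = FAKE_ORDERS.get(order_id, {"status": "unknown"})
    await params.result_callback(result)

order_function = FunctionSchema(
    name="lookup_order",
    description="Look up the status of an order by its ID",
    properties={"order_id": {"type": "string", "description": "The order identifier."}},
    required=["order_id"],
)

# Register the handler and expose both tools to the LLM context.
llm.register_function("lookup_order", lookup_order)
tools = ToolsSchema(standard_tools=[weather_function, order_function])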
diff --git a/llm_together_4c92e3fb.txt b/llm_together_4c92e3fb.txt new file mode 100644 index 0000000000000000000000000000000000000000..7c620a3e1771812670e3e43344a4953d402fee1e --- /dev/null +++ b/llm_together_4c92e3fb.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/services/llm/together#frames +Title: Together AI - Pipecat +================================================== + +Together AI - Pipecat ​ Overview TogetherLLMService provides access to Together AI’s language models, including Meta’s Llama 3.1 and 3.2 models, through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management.
API Reference Complete API documentation and method details Together AI Docs Official Together AI API documentation and features Example Code Working example with function calling ​ Installation To use TogetherLLMService, install the required dependencies: pip install "pipecat-ai[together]" You’ll also need to set your Together AI API key as an environment variable: TOGETHER_API_KEY. Get your API key from the Together AI Console.
​ Frames ​ Input OpenAILLMContextFrame - Conversation context and history. LLMMessagesFrame - Direct message list. VisionImageRawFrame - Images for vision processing (select models). LLMUpdateSettingsFrame - Runtime parameter updates.
​ Output LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries. LLMTextFrame - Streamed completion chunks. FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle. ErrorFrame - API or processing errors.
​ Function Calling Function Calling Guide Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.
​ Context Management Context Management Guide Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.
​ Usage Example
import os

from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.frames.frames import TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.together.llm import TogetherLLMService

# Configure Together AI service
llm = TogetherLLMService(
    api_key=os.getenv("TOGETHER_API_KEY"),
    model="meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",  # Balanced performance
    params=TogetherLLMService.InputParams(temperature=0.7, max_tokens=1000),
)

# Define function for tool calling
weather_function = FunctionSchema(
    name="get_current_weather",
    description="Get current weather information",
    properties={
        "location": {
            "type": "string",
            "description": "City and state, e.g. San Francisco, CA",
        },
        "format": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "Temperature unit to use",
        },
    },
    required=["location", "format"],
)

tools = ToolsSchema(standard_tools=[weather_function])

# Create context optimized for voice
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": """You are a helpful assistant in a voice conversation.
Keep responses concise and avoid special characters for better speech synthesis.""",
        }
    ],
    tools=tools,
)

# Create context aggregators
context_aggregator = llm.create_context_aggregator(context)

# Register function handler with feedback
async def fetch_weather(params):
    location = params.arguments["location"]
    await params.result_callback({"conditions": "sunny", "temperature": "75°F"})

llm.register_function("get_current_weather", fetch_weather)

# Optional: Add function call feedback (tts is assumed to be configured elsewhere)
@llm.event_handler("on_function_calls_started")
async def on_function_calls_started(service, function_calls):
    await tts.queue_frame(TTSSpeakFrame("Let me check on that."))

# Use in pipeline (transport, stt, and tts are assumed to be configured elsewhere)
pipeline = Pipeline(
    [
        transport.input(),
        stt,
        context_aggregator.user(),
        llm,
        tts,
        transport.output(),
        context_aggregator.assistant(),
    ]
)
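The Frames section above lists LLMUpdateSettingsFrame for runtime parameter updates. A rough sketch of how that could be used with a running task is shown below; the exact fields LLMUpdateSettingsFrame accepts (a settings mapping here) can vary between Pipecat releases, so treat this as an assumption to verify against your installed version.

from pipecat.frames.frames import LLMUpdateSettingsFrame

async def lower_temperature(task):
    # Queue a settings update into the running pipeline; the LLM service applies it
    # to subsequent completions. The "temperature" key is assumed to be supported.
    await task.queue_frame(LLMUpdateSettingsFrame(settings={"temperature": 0.2}))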
​ Metrics Inherits all OpenAI metrics capabilities: Time to First Byte (TTFB) - Response latency measurement. Processing Duration - Total request processing time. Token Usage - Prompt tokens, completion tokens, and totals. Enable with:
task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True,
    ),
)
​ Additional Notes OpenAI Compatibility: Full compatibility with OpenAI API features and parameters. Open Source Models: Access to cutting-edge open-source models like Llama. Vision Support: Select models support multimodal image and text understanding. Competitive Pricing: Cost-effective alternative to proprietary model APIs. Flexible Scaling: Choose model size based on performance vs cost requirements.
\ No newline at end of file
​ Frames ​ Input OpenAILLMContextFrame - Conversation context and history LLMMessagesFrame - Direct message list VisionImageRawFrame - Images for vision processing (select models) LLMUpdateSettingsFrame - Runtime parameter updates ​ Output LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries LLMTextFrame - Streamed completion chunks FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle ErrorFrame - API or processing errors ​ Function Calling Function Calling Guide Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications. ​ Context Management Context Management Guide Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences. ​ Usage Example Copy Ask AI import os from pipecat.services.together.llm import TogetherLLMService from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext from pipecat.adapters.schemas.function_schema import FunctionSchema from pipecat.adapters.schemas.tools_schema import ToolsSchema # Configure Together AI service llm = TogetherLLMService( api_key = os.getenv( "TOGETHER_API_KEY" ), model = "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo" , # Balanced performance params = TogetherLLMService.InputParams( temperature = 0.7 , max_tokens = 1000 ) ) # Define function for tool calling weather_function = FunctionSchema( name = "get_current_weather" , description = "Get current weather information" , properties = { "location" : { "type" : "string" , "description" : "City and state, e.g. San Francisco, CA" }, "format" : { "type" : "string" , "enum" : [ "celsius" , "fahrenheit" ], "description" : "Temperature unit to use" } }, required = [ "location" , "format" ] ) tools = ToolsSchema( standard_tools = [weather_function]) # Create context optimized for voice context = OpenAILLMContext( messages = [ { "role" : "system" , "content" : """You are a helpful assistant in a voice conversation. Keep responses concise and avoid special characters for better speech synthesis.""" } ], tools = tools ) # Create context aggregators context_aggregator = llm.create_context_aggregator(context) # Register function handler with feedback async def fetch_weather ( params ): location = params.arguments[ "location" ] await params.result_callback({ "conditions" : "sunny" , "temperature" : "75°F" }) llm.register_function( "get_current_weather" , fetch_weather) # Optional: Add function call feedback @llm.event_handler ( "on_function_calls_started" ) async def on_function_calls_started ( service , function_calls ): await tts.queue_frame(TTSSpeakFrame( "Let me check on that." 
)) # Use in pipeline pipeline = Pipeline([ transport.input(), stt, context_aggregator.user(), llm, tts, transport.output(), context_aggregator.assistant() ]) ​ Metrics Inherits all OpenAI metrics capabilities: Time to First Byte (TTFB) - Response latency measurement Processing Duration - Total request processing time Token Usage - Prompt tokens, completion tokens, and totals Enable with: Copy Ask AI task = PipelineTask( pipeline, params = PipelineParams( enable_metrics = True , enable_usage_metrics = True ) ) ​ Additional Notes OpenAI Compatibility : Full compatibility with OpenAI API features and parameters Open Source Models : Access to cutting-edge open-source models like Llama Vision Support : Select models support multimodal image and text understanding Competitive Pricing : Cost-effective alternative to proprietary model APIs Flexible Scaling : Choose model size based on performance vs cost requirements SambaNova AWS Polly On this page Overview Installation Frames Input Output Function Calling Context Management Usage Example Metrics Additional Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/llm_together_ccde5679.txt b/llm_together_ccde5679.txt new file mode 100644 index 0000000000000000000000000000000000000000..41a5177c1f4a87392f2e70fe197d49a197e21010 --- /dev/null +++ b/llm_together_ccde5679.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/services/llm/together#usage-example +Title: Together AI - Pipecat +================================================== + +Together AI - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation LLM Together AI Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Anthropic AWS Bedrock Azure Cerebras DeepSeek Fireworks AI Google Gemini Google Vertex AI Grok Groq NVIDIA NIM Ollama OpenAI OpenPipe OpenRouter Perplexity Qwen SambaNova Together AI Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview TogetherLLMService provides access to Together AI’s language models, including Meta’s Llama 3.1 and 3.2 models, through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management. API Reference Complete API documentation and method details Together AI Docs Official Together AI API documentation and features Example Code Working example with function calling ​ Installation To use TogetherLLMService , install the required dependencies: Copy Ask AI pip install "pipecat-ai[together]" You’ll also need to set up your Together AI API key as an environment variable: TOGETHER_API_KEY . Get your API key from Together AI Console . 
​ Frames ​ Input OpenAILLMContextFrame - Conversation context and history LLMMessagesFrame - Direct message list VisionImageRawFrame - Images for vision processing (select models) LLMUpdateSettingsFrame - Runtime parameter updates ​ Output LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries LLMTextFrame - Streamed completion chunks FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle ErrorFrame - API or processing errors ​ Function Calling Function Calling Guide Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications. ​ Context Management Context Management Guide Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences. ​ Usage Example Copy Ask AI import os from pipecat.services.together.llm import TogetherLLMService from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext from pipecat.adapters.schemas.function_schema import FunctionSchema from pipecat.adapters.schemas.tools_schema import ToolsSchema # Configure Together AI service llm = TogetherLLMService( api_key = os.getenv( "TOGETHER_API_KEY" ), model = "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo" , # Balanced performance params = TogetherLLMService.InputParams( temperature = 0.7 , max_tokens = 1000 ) ) # Define function for tool calling weather_function = FunctionSchema( name = "get_current_weather" , description = "Get current weather information" , properties = { "location" : { "type" : "string" , "description" : "City and state, e.g. San Francisco, CA" }, "format" : { "type" : "string" , "enum" : [ "celsius" , "fahrenheit" ], "description" : "Temperature unit to use" } }, required = [ "location" , "format" ] ) tools = ToolsSchema( standard_tools = [weather_function]) # Create context optimized for voice context = OpenAILLMContext( messages = [ { "role" : "system" , "content" : """You are a helpful assistant in a voice conversation. Keep responses concise and avoid special characters for better speech synthesis.""" } ], tools = tools ) # Create context aggregators context_aggregator = llm.create_context_aggregator(context) # Register function handler with feedback async def fetch_weather ( params ): location = params.arguments[ "location" ] await params.result_callback({ "conditions" : "sunny" , "temperature" : "75°F" }) llm.register_function( "get_current_weather" , fetch_weather) # Optional: Add function call feedback @llm.event_handler ( "on_function_calls_started" ) async def on_function_calls_started ( service , function_calls ): await tts.queue_frame(TTSSpeakFrame( "Let me check on that." 
)) # Use in pipeline pipeline = Pipeline([ transport.input(), stt, context_aggregator.user(), llm, tts, transport.output(), context_aggregator.assistant() ]) ​ Metrics Inherits all OpenAI metrics capabilities: Time to First Byte (TTFB) - Response latency measurement Processing Duration - Total request processing time Token Usage - Prompt tokens, completion tokens, and totals Enable with: Copy Ask AI task = PipelineTask( pipeline, params = PipelineParams( enable_metrics = True , enable_usage_metrics = True ) ) ​ Additional Notes OpenAI Compatibility : Full compatibility with OpenAI API features and parameters Open Source Models : Access to cutting-edge open-source models like Llama Vision Support : Select models support multimodal image and text understanding Competitive Pricing : Cost-effective alternative to proprietary model APIs Flexible Scaling : Choose model size based on performance vs cost requirements SambaNova AWS Polly On this page Overview Installation Frames Input Output Function Calling Context Management Usage Example Metrics Additional Notes Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/llm_together_dd62fd16.txt b/llm_together_dd62fd16.txt new file mode 100644 index 0000000000000000000000000000000000000000..1d9fa9a5ea2d815abd85270ee703c0bcf8f72ed5 --- /dev/null +++ b/llm_together_dd62fd16.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/services/llm/together#additional-notes +Title: Together AI - Pipecat +================================================== + +Together AI - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation LLM Together AI Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Anthropic AWS Bedrock Azure Cerebras DeepSeek Fireworks AI Google Gemini Google Vertex AI Grok Groq NVIDIA NIM Ollama OpenAI OpenPipe OpenRouter Perplexity Qwen SambaNova Together AI Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview TogetherLLMService provides access to Together AI’s language models, including Meta’s Llama 3.1 and 3.2 models, through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management. API Reference Complete API documentation and method details Together AI Docs Official Together AI API documentation and features Example Code Working example with function calling ​ Installation To use TogetherLLMService , install the required dependencies: Copy Ask AI pip install "pipecat-ai[together]" You’ll also need to set up your Together AI API key as an environment variable: TOGETHER_API_KEY . Get your API key from Together AI Console . 
\ No newline at end of file
diff --git a/llm_together_e7ccd69c.txt b/llm_together_e7ccd69c.txt new file mode 100644 index 0000000000000000000000000000000000000000..39a50fbd2fd934a2169c6c8e113c0b50aef37855 --- /dev/null +++ b/llm_together_e7ccd69c.txt @@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/server/services/llm/together#function-calling
+Title: Together AI - Pipecat
+==================================================
\ No newline at end of file
diff --git a/llm_together_fe91267a.txt b/llm_together_fe91267a.txt new file mode 100644 index 0000000000000000000000000000000000000000..9f140efd70fe623d994960a3edf16e83d43edf55 --- /dev/null +++ b/llm_together_fe91267a.txt @@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/server/services/llm/together#overview
+Title: Together AI - Pipecat
+==================================================
\ No newline at end of file
diff --git a/mcp_mcp_51902d7b.txt b/mcp_mcp_51902d7b.txt new file mode 100644 index 0000000000000000000000000000000000000000..3efc1b48abebc9edd97bea615a602dc55246f56c --- /dev/null +++ b/mcp_mcp_51902d7b.txt @@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/server/utilities/mcp/mcp#mcp-stdio-transport-implementation
+Title: MCPClient - Pipecat
+==================================================
+
Overview

MCP is an open standard for enabling AI agents to interact with external data and tools. MCPClient provides a way to access and call tools via MCP. For example, instead of writing bespoke function call implementations for an external API, you may use an MCP server that provides a bridge to the API. Be aware there may be security implications. See the MCP documentation for more details.

Installation

To use MCPClient, install the required dependencies:

pip install "pipecat-ai[mcp]"

You may also need to set environment variables as required by the specific MCP server to which you are connecting.

Configuration

Constructor Parameters

You can connect to your MCP server via Stdio or SSE transport. See here for more documentation on MCP transports.

server_params (str | StdioServerParameters, required)

You can provide either:
- URL: "https://your.mcp.server/sse"
- StdioServerParameters, which are defined as:

StdioServerParameters(
    command="python",             # Executable
    args=["example_server.py"],   # Optional command line arguments
    env=None,                     # Optional environment variables
)

Input Parameters

See more information regarding server params here.
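To make the server_params choice above concrete, here is a small, hedged sketch that picks a transport at startup. It uses only the constructor forms shown on this page; the MCP_SSE_URL environment variable, the build_mcp_client helper, and the "@name/mcp-server-name@latest" package name are illustrative placeholders, not part of Pipecat.

import os
import shutil

from mcp import StdioServerParameters
from pipecat.services.mcp_service import MCPClient


def build_mcp_client() -> MCPClient:
    # Illustrative helper: MCP_SSE_URL is an assumed environment variable.
    sse_url = os.getenv("MCP_SSE_URL")
    if sse_url:
        # SSE transport: server_params is just the server URL string.
        return MCPClient(server_params=sse_url)
    # Stdio transport: spawn a local MCP server process instead.
    return MCPClient(
        server_params=StdioServerParameters(
            command=shutil.which("npx"),
            args=["-y", "@name/mcp-server-name@latest"],
            env={"ENV_API_KEY": os.getenv("ENV_API_KEY", "")},
        )
    )

Either way, the resulting client is used the same way in the examples below: call await mcp.register_tools(llm) and pass the returned tools schema into the context.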
Usage Example

MCP Stdio Transport Implementation

# Import MCPClient and StdioServerParameters
...
from mcp import StdioServerParameters
from pipecat.services.mcp_service import MCPClient
...

# Initialize an LLM
llm = ...

# Initialize and configure MCPClient with server parameters
mcp = MCPClient(
    server_params=StdioServerParameters(
        command=shutil.which("npx"),
        args=["-y", "@name/mcp-server-name@latest"],
        env={"ENV_API_KEY": ""},
    )
)

# Create tools schema from the MCP server and register them with llm
tools = await mcp.register_tools(llm)

# Create context with system message and tools
# Tip: Let the LLM know it has access to tools from an MCP server by including it in the system prompt.
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant in a voice conversation. You have access to MCP tools. Keep responses concise."
        }
    ],
    tools=tools
)

MCP SSE Transport Implementation

# Import MCPClient
...
from pipecat.services.mcp_service import MCPClient
...

# Initialize an LLM
llm = ...

# Initialize and configure MCPClient with MCP SSE server url
mcp = MCPClient(server_params="https://your.mcp.server/sse")

# Create tools schema from the MCP server and register them with llm
tools = await mcp.register_tools(llm)

# Create context with system message and tools
# Tip: Let the LLM know it has access to tools from an MCP server by including it in the system prompt.
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant in a voice conversation. You have access to MCP tools. Keep responses concise."
        }
    ],
    tools=tools
)

Methods

register_tools (async method)

Converts MCP tools to Pipecat-friendly function definitions and registers the functions with the llm.

async def register_tools(self, llm) -> ToolsSchema:

Additional documentation

See MCP’s docs for MCP related updates.
\ No newline at end of file
diff --git a/mcp_mcp_923dda2b.txt b/mcp_mcp_923dda2b.txt new file mode 100644 index 0000000000000000000000000000000000000000..0693e00e8516681d28c3702814efc8c62a8cc7e9 --- /dev/null +++ b/mcp_mcp_923dda2b.txt @@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/server/utilities/mcp/mcp#input-parameters
+Title: MCPClient - Pipecat
+==================================================
\ No newline at end of file
diff --git a/mcp_mcp_99da9717.txt b/mcp_mcp_99da9717.txt new file mode 100644 index 0000000000000000000000000000000000000000..928895842b14bd4b87f46d91fed92bee2a88fb97 --- /dev/null +++ b/mcp_mcp_99da9717.txt @@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/server/utilities/mcp/mcp#installation
+Title: MCPClient - Pipecat
+==================================================
\ No newline at end of file
diff --git a/mcp_mcp_e4e56e4e.txt b/mcp_mcp_e4e56e4e.txt new file mode 100644 index 0000000000000000000000000000000000000000..3c0eba0af73acde90d7cfdedf1fab772f7968651 --- /dev/null +++ b/mcp_mcp_e4e56e4e.txt @@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/server/utilities/mcp/mcp#configuration
+Title: MCPClient - Pipecat
+==================================================
\ No newline at end of file
diff --git a/memory_mem0_03251ffb.txt b/memory_mem0_03251ffb.txt new file mode 100644 index 0000000000000000000000000000000000000000..4d95e74c09f3a621baba08ef4289e3ee2d7efa5c --- /dev/null +++ b/memory_mem0_03251ffb.txt @@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/server/services/memory/mem0#constructor-parameters
+Title: Mem0 - Pipecat
+==================================================
Overview

Mem0MemoryService provides long-term memory capabilities for conversational agents by integrating with Mem0’s API. It automatically stores conversation history and retrieves relevant past context based on the current conversation, enhancing LLM responses with persistent memory across sessions.

Installation

To use the Mem0 memory service, install the required dependencies:

pip install "pipecat-ai[mem0]"

You’ll also need to set up your Mem0 API key as an environment variable: MEM0_API_KEY. You can obtain a Mem0 API key by signing up at mem0.ai.

Mem0MemoryService

Constructor Parameters

- api_key (str, required): Mem0 API key for accessing the service
- user_id (str): Unique identifier for the end user to associate with memories
- agent_id (str): Identifier for the agent using the memory service
- run_id (str): Identifier for the specific conversation session
- params (InputParams): Configuration parameters for memory retrieval (see below)
- local_config (dict): Configuration for using local LLMs and embedders instead of Mem0’s cloud API (see Local Configuration section)

At least one of user_id, agent_id, or run_id must be provided to organize memories.

Input Parameters

The params object accepts the following configuration settings:

- search_limit (int, default: 10): Maximum number of relevant memories to retrieve per query
- search_threshold (float, default: 0.1): Relevance threshold for memory retrieval (0.0 to 1.0)
- api_version (str, default: "v2"): Mem0 API version to use
- system_prompt (str): Prefix text to add before retrieved memories
- add_as_system_message (bool, default: True): Whether to add memories as a system message (True) or user message (False)
- position (int, default: 1): Position in the context where memories should be inserted
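As an illustration of the parameters above, here is a hedged sketch of constructing the service with a custom params object. It assumes the InputParams container is exposed as Mem0MemoryService.InputParams, following the pattern other Pipecat services use; the field names and defaults come from the table above.

import os

from pipecat.services.mem0.memory import Mem0MemoryService

# Sketch only: the Mem0MemoryService.InputParams path is an assumption.
memory = Mem0MemoryService(
    api_key=os.getenv("MEM0_API_KEY"),
    user_id="user123",
    params=Mem0MemoryService.InputParams(
        search_limit=5,                          # fetch at most 5 memories per query
        search_threshold=0.3,                    # require stronger relevance matches
        system_prompt="Relevant past context:",  # prefix added before retrieved memories
        add_as_system_message=True,              # inject memories as a system message
        position=1,                              # insert near the top of the context
    ),
)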
Input Frames

The service processes the following input frames:
- OpenAILLMContextFrame - Contains OpenAI-specific conversation context
- LLMMessagesFrame - Contains conversation messages in standard format

Output Frames

The service may produce the following output frames:
- LLMMessagesFrame - Enhanced messages with relevant memories included
- OpenAILLMContextFrame - Enhanced OpenAI context with memories included
- ErrorFrame - Contains error information if memory operations fail

Memory Operations

The service performs two main operations automatically:

Message Storage

All conversation messages are stored in Mem0 for future reference. The service:
- Captures full message history from context frames
- Associates messages with the specified user/agent/run IDs
- Stores metadata to enable efficient retrieval

Memory Retrieval

When a new user message is detected, the service:
- Uses the message as a search query
- Retrieves relevant past memories from Mem0
- Formats memories with the configured system prompt
- Adds the formatted memories to the conversation context
- Passes the enhanced context downstream in the pipeline

Pipeline Positioning

The memory service should be positioned after the user context aggregator but before the LLM service:

context_aggregator.user() → memory_service → llm

This ensures that:
- The user’s latest message is included in the context
- The memory service can enhance the context before the LLM processes it
- The LLM receives the enhanced context with relevant memories

Usage Examples

Basic Integration

from pipecat.services.mem0.memory import Mem0MemoryService
from pipecat.pipeline.pipeline import Pipeline

# Create the memory service
memory = Mem0MemoryService(
    api_key=os.getenv("MEM0_API_KEY"),
    user_id="user123",  # Unique user identifier
)

# Position the memory service between context aggregator and LLM
pipeline = Pipeline([
    transport.input(),
    context_aggregator.user(),
    memory,  # <-- Memory service enhances context here
    llm,
    tts,
    transport.output(),
    context_aggregator.assistant()
])

Using Local Configuration

The local_config parameter allows you to use your own LLM and embedding providers instead of Mem0’s cloud API. This is useful for self-hosted deployments or when you want more control over the memory processing.

local_config = {
    "llm": {
        "provider": str,  # LLM provider name (e.g., "anthropic", "openai")
        "config": {       # Provider-specific configuration
            "model": str,    # Model name
            "api_key": str,  # API key for the provider
            # Other provider-specific parameters
        }
    },
    "embedder": {
        "provider": str,  # Embedding provider name (e.g., "openai")
        "config": {       # Provider-specific configuration
            "model": str,  # Model name
            # Other provider-specific parameters
        }
    }
}

# Initialize Mem0 memory service with local configuration
memory = Mem0MemoryService(
    local_config=local_config,  # Use local LLM for memory processing
    user_id="user123",          # Unique identifier for the user
)

When using local_config, do not provide the api_key parameter.

Frame Flow

Error Handling

The service includes basic error handling to ensure conversation flow continues even when memory operations fail:
- Exceptions during memory storage and retrieval are caught and logged
- If an error occurs during frame processing, an ErrorFrame is emitted with error details
- The original frame is still passed downstream to prevent the pipeline from stalling
- Connection and authentication errors from the Mem0 API will be logged but won’t interrupt the conversation

While the service attempts to handle errors gracefully, memory operations that fail may result in missing context in conversations. Monitor your application logs for memory-related errors.
\ No newline at end of file
diff --git a/memory_mem0_3cfa2d65.txt b/memory_mem0_3cfa2d65.txt new file mode 100644 index 0000000000000000000000000000000000000000..09da974e8b20ba3560952460cc88da6ec0ebb025 --- /dev/null +++ b/memory_mem0_3cfa2d65.txt @@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/server/services/memory/mem0#using-local-configuration
+Title: Mem0 - Pipecat
+==================================================
\ No newline at end of file
diff --git a/memory_mem0_3f19f368.txt b/memory_mem0_3f19f368.txt new file mode 100644 index 0000000000000000000000000000000000000000..4dd13810772e308ff35b3de20cc37d87a40cd4e4 --- /dev/null +++ b/memory_mem0_3f19f368.txt @@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/server/services/memory/mem0
+Title: Mem0 - Pipecat
+==================================================
\ No newline at end of file
diff --git a/memory_mem0_4f441906.txt b/memory_mem0_4f441906.txt new file mode 100644 index 0000000000000000000000000000000000000000..47864af48696bc0b0d7346b77e7f99ff69c4bf5b --- /dev/null +++ b/memory_mem0_4f441906.txt @@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/server/services/memory/mem0#usage-examples
+Title: Mem0 - Pipecat
+==================================================
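Whichever backend handles memory processing, retrieval behavior can be tuned through the params argument described under Input Parameters. The sketch below assumes InputParams is exposed as Mem0MemoryService.InputParams (this page does not show the exact import, so check the module) and uses the documented field names:

    import os

    from pipecat.services.mem0.memory import Mem0MemoryService

    # Assumed access path for InputParams; field names follow the
    # Input Parameters list above.
    params = Mem0MemoryService.InputParams(
        search_limit=5,              # return at most 5 memories per query
        search_threshold=0.3,        # require a higher relevance score
        system_prompt="Relevant facts about this user:",
        add_as_system_message=True,  # inject memories as a system message
        position=1,                  # insert near the top of the context
    )

    memory = Mem0MemoryService(
        api_key=os.getenv("MEM0_API_KEY"),
        user_id="user123",
        params=params,
    )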
Frame Flow

(Diagram not included in this text capture.)

Error Handling

The service includes basic error handling to ensure the conversation continues even when memory operations fail:

- Exceptions during memory storage and retrieval are caught and logged
- If an error occurs during frame processing, an ErrorFrame is emitted with the error details
- The original frame is still passed downstream to prevent the pipeline from stalling
- Connection and authentication errors from the Mem0 API are logged but won't interrupt the conversation

While the service attempts to handle errors gracefully, memory operations that fail may result in missing context in conversations. Monitor your application logs for memory-related errors.
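One further usage note, implied by the constructor parameters but not shown as an example on this page: memories can be organized by agent or by conversation session instead of (or in addition to) the end user. A sketch using run_id to keep memories scoped to a single session; the identifiers here are hypothetical:

    import os
    from uuid import uuid4

    from pipecat.services.mem0.memory import Mem0MemoryService

    # Scope memories to one agent and one conversation session.
    # At least one of user_id, agent_id, or run_id is required.
    memory = Mem0MemoryService(
        api_key=os.getenv("MEM0_API_KEY"),
        agent_id="support-bot",           # hypothetical agent identifier
        run_id=f"session-{uuid4().hex}",  # fresh scope for each session
    )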
URL: https://docs.pipecat.ai/server/utilities/observers/debug-observer#log-output-format
Title: Debug Log Observer - Pipecat
==================================================

The DebugLogObserver provides detailed logging of frame activity in your Pipecat pipeline, with full visibility into frame content and flexible filtering options.

Features

- Log all frame types and their content
- Filter by specific frame types
- Filter by source or destination components
- Automatic formatting of frame fields
- Special handling for complex data structures

Usage

Log All Frames

Log all frames passing through the pipeline:

from pipecat.observers.loggers.debug_log_observer import DebugLogObserver

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        observers=[DebugLogObserver()],
    ),
)

Filter by Frame Types

Log only specific frame types:

from pipecat.frames.frames import TranscriptionFrame, InterimTranscriptionFrame
from pipecat.observers.loggers.debug_log_observer import DebugLogObserver

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        observers=[
            DebugLogObserver(frame_types=(
                TranscriptionFrame,
                InterimTranscriptionFrame,
            ))
        ],
    ),
)

Advanced Source/Destination Filtering

Filter frames based on their type and their source or destination:

from pipecat.frames.frames import StartInterruptionFrame, UserStartedSpeakingFrame, LLMTextFrame
from pipecat.observers.loggers.debug_log_observer import DebugLogObserver, FrameEndpoint
from pipecat.transports.base_output_transport import BaseOutputTransport
from pipecat.services.stt_service import STTService

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        observers=[
            DebugLogObserver(frame_types={
                # Only log StartInterruptionFrame when the source is BaseOutputTransport
                StartInterruptionFrame: (BaseOutputTransport, FrameEndpoint.SOURCE),
                # Only log UserStartedSpeakingFrame when the destination is STTService
                UserStartedSpeakingFrame: (STTService, FrameEndpoint.DESTINATION),
                # Log LLMTextFrame regardless of source or destination
                LLMTextFrame: None,
            })
        ],
    ),
)

Log Output Format

The observer logs each frame with its complete details:

[Source] → [Destination]: [FrameType] [field1: value1, field2: value2, ...] at [timestamp]s

For example:

OpenAILLMService#0 → DailyTransport#0: LLMTextFrame text: 'Hello, how can I help you today?' at 1.24s

Configuration Options

- frame_types (Tuple[Type[Frame], ...] or Dict[Type[Frame], Optional[Tuple[Type, FrameEndpoint]]]): Frame types to log, with optional source/destination filtering
- exclude_fields (Set[str]): Field names to exclude from logging (defaults to binary fields)

FrameEndpoint Enum

The FrameEndpoint enum is used for source/destination filtering:
- FrameEndpoint.SOURCE: Filter by source component
- FrameEndpoint.DESTINATION: Filter by destination component
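The exclude_fields option can be combined with any of the filtering styles shown above. A minimal sketch, assuming you want to keep bulky payload fields out of the logs; the field names "audio" and "image" are illustrative, not a documented default:

from pipecat.observers.loggers.debug_log_observer import DebugLogObserver

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        observers=[
            # Illustrative field names; by default binary fields are excluded.
            DebugLogObserver(exclude_fields={"audio", "image"}),
        ],
    ),
)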
URL: https://docs.pipecat.ai/server/utilities/observers/observer-pattern#base-observer
Title: Observer Pattern - Pipecat
==================================================

The Observer pattern in Pipecat allows non-intrusive monitoring of frames as they flow through the pipeline. Observers can watch frame traffic without affecting the pipeline's core functionality.

Base Observer

All observers must inherit from BaseObserver and implement the on_push_frame method:

from pipecat.observers.base_observer import BaseObserver

class CustomObserver(BaseObserver):
    async def on_push_frame(
        self,
        src: FrameProcessor,
        dst: FrameProcessor,
        frame: Frame,
        direction: FrameDirection,
        timestamp: int,
    ):
        # Your frame observation logic here
        pass

Available Observers

Pipecat provides several built-in observers:
- LLMLogObserver: Logs LLM activity and responses
- TranscriptionLogObserver: Logs speech-to-text transcription events
- RTVIObserver: Converts internal frames to RTVI protocol messages for server-to-client messaging

Using Multiple Observers

You can attach multiple observers to a pipeline task. Each observer will be notified of all frames:

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        observers=[LLMLogObserver(), TranscriptionLogObserver(), CustomObserver()],
    ),
)

Example: Debug Observer

Here's an example observer that logs interruptions and bot speaking events:

class DebugObserver(BaseObserver):
    """Observer to log interruptions and bot speaking events to the console.

    Logs all frame instances of:
    - StartInterruptionFrame
    - BotStartedSpeakingFrame
    - BotStoppedSpeakingFrame

    This allows you to see the frame flow from processor to processor through
    the pipeline for these frames.

    Log format:
    [EVENT TYPE]: [source processor] → [destination processor] at [timestamp]s
    """

    async def on_push_frame(
        self,
        src: FrameProcessor,
        dst: FrameProcessor,
        frame: Frame,
        direction: FrameDirection,
        timestamp: int,
    ):
        time_sec = timestamp / 1_000_000_000
        arrow = "→" if direction == FrameDirection.DOWNSTREAM else "←"
        if isinstance(frame, StartInterruptionFrame):
            logger.info(f"⚡ INTERRUPTION START: {src} {arrow} {dst} at {time_sec:.2f}s")
        elif isinstance(frame, BotStartedSpeakingFrame):
            logger.info(f"🤖 BOT START SPEAKING: {src} {arrow} {dst} at {time_sec:.2f}s")
        elif isinstance(frame, BotStoppedSpeakingFrame):
            logger.info(f"🤖 BOT STOP SPEAKING: {src} {arrow} {dst} at {time_sec:.2f}s")

Common Use Cases

Observers are particularly useful for:
- Debugging frame flow
- Logging specific events
- Monitoring pipeline behavior
- Collecting metrics
- Converting internal frames to external messages
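As a further illustration of the pattern, the sketch below tallies frames by type, which is one way to cover the "collecting metrics" use case. FrameCounterObserver is a hypothetical class written against the on_push_frame signature shown above, not a built-in Pipecat observer:

from collections import Counter

from pipecat.observers.base_observer import BaseObserver


class FrameCounterObserver(BaseObserver):
    """Hypothetical observer that counts how many frames of each type pass by."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    async def on_push_frame(self, src, dst, frame, direction, timestamp):
        # Key by frame class name; inspect self.counts after the pipeline finishes.
        self.counts[type(frame).__name__] += 1

It would be attached like any other observer, for example observers=[FrameCounterObserver()] in PipelineParams.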
URL: https://docs.pipecat.ai/server/utilities/observers/turn-tracking-observer
Title: Turn Tracking Observer - Pipecat
==================================================

The TurnTrackingObserver monitors and tracks conversational turns in your Pipecat pipeline, providing events when turns start and end. It intelligently identifies when a user-bot interaction cycle begins and completes.

Turn Lifecycle

A turn represents a complete user-bot interaction cycle:
- Start: When the user starts speaking (or the pipeline starts, for the first turn)
- Processing: The user speaks, and the bot processes and responds
- End: After the bot finishes speaking and either the user starts speaking again or a timeout period elapses with no further activity

Events

The observer emits two main events:
- on_turn_started: When a new turn begins. Parameters: turn_number (int)
- on_turn_ended: When a turn completes. Parameters: turn_number (int), duration (float, in seconds), was_interrupted (bool)

Usage

The observer is automatically created when you initialize a PipelineTask with enable_turn_tracking=True (which is the default):

task = PipelineTask(
    pipeline,
    params=PipelineParams(allow_interruptions=True),
    # Turn tracking is enabled by default
)

# Access the observer
turn_observer = task.turn_tracking_observer

# Register event handlers
@turn_observer.event_handler("on_turn_started")
async def on_turn_started(observer, turn_number):
    logger.info(f"Turn {turn_number} started")

@turn_observer.event_handler("on_turn_ended")
async def on_turn_ended(observer, turn_number, duration, was_interrupted):
    status = "interrupted" if was_interrupted else "completed"
    logger.info(f"Turn {turn_number} {status} in {duration:.2f}s")

Configuration

You can configure the observer's behavior when creating a PipelineTask:

from pipecat.observers.turn_tracking_observer import TurnTrackingObserver

# Create a custom observer instance
custom_turn_tracker = TurnTrackingObserver(
    turn_end_timeout_secs=3.5,  # Turn end timeout (default: 2.5)
)

# Add it as a regular observer
task = PipelineTask(
    pipeline,
    observers=[custom_turn_tracker],
    # Disable the default one if adding your own
    enable_turn_tracking=False,
)
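If you create your own instance this way, the event handlers would presumably be registered on that instance rather than on task.turn_tracking_observer. A short sketch under that assumption, reusing custom_turn_tracker from the snippet above:

# Assumes the custom_turn_tracker created above and a loguru-style logger.
@custom_turn_tracker.event_handler("on_turn_started")
async def on_custom_turn_started(observer, turn_number):
    logger.info(f"(custom tracker) turn {turn_number} started")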
Interruptions

The observer automatically detects interruptions when the user starts speaking while the bot is still speaking. In this case:
- The current turn is marked as interrupted (was_interrupted=True)
- A new turn begins immediately

How It Works

The observer monitors specific frame types to track conversation flow:
- StartFrame: Initiates the first turn
- UserStartedSpeakingFrame: Starts user speech or triggers a new turn
- BotStartedSpeakingFrame: Marks the beginning of bot speech
- BotStoppedSpeakingFrame: Starts the turn end timeout

After the bot stops speaking, the observer waits for the configured timeout period. If no further bot speech occurs, the turn ends; otherwise, the activity continues as part of the same turn.

Use Cases

- Analytics: Measure turn durations, interruption rates, and conversation flow
- Logging: Record turn-based logs for diagnostics and analysis
- Visualization: Show turn-based conversation timelines in UIs
- Tracing: Group spans and metrics by conversation turns
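Because every on_turn_ended event reports duration and was_interrupted, simple analytics such as average turn length and interruption rate can be accumulated in a handler. A minimal sketch, assuming the default turn tracker on the PipelineTask and a loguru-style logger (not an excerpt from the Pipecat docs):

stats = {"durations": [], "interrupted": 0}

turn_observer = task.turn_tracking_observer


@turn_observer.event_handler("on_turn_ended")
async def collect_turn_stats(observer, turn_number, duration, was_interrupted):
    # Accumulate per-turn metrics as the conversation progresses.
    stats["durations"].append(duration)
    if was_interrupted:
        stats["interrupted"] += 1
    avg = sum(stats["durations"]) / len(stats["durations"])
    rate = stats["interrupted"] / turn_number
    logger.info(f"{turn_number} turns, avg {avg:.2f}s, interruption rate {rate:.0%}")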
“Multimodal” means you can use any combination of audio, video, images, and/or text in your interactions. And “real-time” means that things are happening quickly enough that it feels conversational—a “back-and-forth” with a bot, not submitting a query and waiting for results. ​ What You Can Build Voice Assistants Natural, real-time conversations with AI using speech recognition and synthesis Interactive Agents Personal coaches and meeting assistants that can understand context and provide guidance Multimodal Apps Applications that combine voice, video, images, and text for rich interactions Creative Tools Storytelling experiences and social companions that engage users Business Solutions Customer intake flows and support bots for automated business processes Complex Flows Structured conversations using Pipecat Flows for managing complex interactions ​ How It Works The flow of interactions in a Pipecat application is typically straightforward: The bot says something The user says something The bot says something The user says something This continues until the conversation naturally ends. While this flow seems simple, making it feel natural requires sophisticated real-time processing. ​ Real-time Processing Pipecat’s pipeline architecture handles both simple voice interactions and complex multimodal processing. Let’s look at how data flows through the system: Voice app Multimodal app 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio Transmit and capture streamed audio from the user 2 Transcribe Speech Convert speech to text as the user is talking 3 Process with LLM Generate responses using a large language model 4 Convert to Speech Transform text responses into natural speech 5 Play Audio Stream the audio response back to the user 1 Send Audio and Video Transmit and capture audio, video, and image inputs simultaneously 2 Process Streams Handle multiple input streams in parallel 3 Model Processing Send combined inputs to multimodal models (like GPT-4V) 4 Generate Outputs Create various outputs (text, images, audio, etc.) 5 Coordinate Presentation Synchronize and present multiple output types In both cases, Pipecat: Processes responses as they stream in Handles multiple input/output modalities concurrently Manages resource allocation and synchronization Coordinates parallel processing tasks This architecture creates fluid, natural interactions without noticeable delays, whether you’re building a simple voice assistant or a complex multimodal application. Pipecat’s pipeline architecture is particularly valuable for managing the complexity of real-time, multimodal interactions, ensuring smooth data flow and proper synchronization regardless of the input/output types involved. Pipecat handles all this complexity for you, letting you focus on building your application rather than managing the underlying infrastructure. ​ Next Steps Ready to build your first Pipecat application? 
URL: https://docs.pipecat.ai/server/pipeline/heartbeats
Title: Pipeline Heartbeats - Pipecat
==================================================

Overview

Pipeline heartbeats provide a way to monitor the health of your pipeline by sending periodic heartbeat frames through the system. When enabled, the pipeline sends a heartbeat frame every second and monitors its progress through the pipeline.

Enabling Heartbeats

Heartbeats can be enabled by setting enable_heartbeats to True in the PipelineParams:

from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineParams, PipelineTask

pipeline = Pipeline([...])

params = PipelineParams(enable_heartbeats=True)

task = PipelineTask(pipeline, params=params)

How It Works

When heartbeats are enabled:

- The pipeline sends a HeartbeatFrame every second
- The frame traverses all processors in the pipeline, from source to sink
- The pipeline monitors how long heartbeat frames take to complete their journey
- If a heartbeat frame isn't received within the monitoring window, a warning is logged

Monitoring Output

The system logs:

- Trace-level logs showing heartbeat processing time
- Warning messages if heartbeats aren't received within the monitoring window

Example warning message:

WARNING PipelineTask#1: heartbeat frame not received for more than 5.0 seconds

Use Cases

Heartbeat monitoring is useful for:

- Detecting pipeline stalls or blockages
- Monitoring processing latency through the pipeline
- Identifying performance issues in specific processors
- Ensuring the pipeline remains responsive

Configuration

The heartbeat system uses two key timing constants:

- HEARTBEAT_SECONDS = 1.0 - interval between heartbeat frames
- HEARTBEAT_MONITOR_SECONDS = 10.0 - time before warning if no heartbeat is received

The heartbeat period can also be adjusted with the heartbeats_period_secs parameter in PipelineParams (see the PipelineParams page); the monitoring window is currently fixed.
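Heartbeat monitoring only detects stalls end-to-end when every processor in the pipeline forwards the frames it does not consume. The sketch below shows the standard pass-through shape of a custom FrameProcessor; the class name and comments are illustrative and not part of the heartbeats API.

from pipecat.frames.frames import Frame
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor


class PassThroughProcessor(FrameProcessor):
    """Example processor that forwards every frame so heartbeats keep flowing."""

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)
        # Do any custom work here, then always push the frame along.
        # Swallowing frames (including HeartbeatFrame) would trigger the
        # "heartbeat frame not received" warning described above.
        await self.push_frame(frame, direction)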
URL: https://docs.pipecat.ai/server/pipeline/parallel-pipeline
Title: ParallelPipeline - Pipecat
==================================================

Overview

ParallelPipeline allows you to create multiple independent processing branches that run simultaneously, sharing input and coordinating output. It is particularly useful for multi-agent systems, parallel stream processing, and redundant service paths.

Each branch receives the same downstream frames, processes them independently, and the results are merged back into a single stream. System frames (like StartFrame and EndFrame) are synchronized across all branches.

Constructor Parameters

*args (List[List[FrameProcessor]], required)
Multiple lists of processors, where each list defines a parallel branch. All branches execute simultaneously when frames flow through the pipeline.

Usage Examples

Multi-Agent Conversation

Create a conversation with two AI agents that can interact with the user independently:

pipeline = Pipeline([
    transport.input(),
    ParallelPipeline(
        # Agent 1: Customer service representative
        [
            stt_1,
            context_aggregator.user_a(),
            llm_agent_1,
            tts_agent_1,
        ],
        # Agent 2: Technical specialist
        [
            stt_2,
            context_aggregator.user_b(),
            llm_agent_2,
            tts_agent_2,
        ],
    ),
    transport.output(),
])

Redundant Services with Failover

Set up redundant services with automatic failover:

pipeline = Pipeline([
    transport.input(),
    stt,
    ParallelPipeline(
        # Primary LLM service
        [
            gate_primary,
            primary_llm,
            error_detector,
        ],
        # Backup LLM service (used only if the primary fails)
        [
            gate_backup,
            backup_llm,
            fallback_processor,
        ],
    ),
    tts,
    transport.output(),
])

Cross-Branch Communication

Use Producer/Consumer processors to share data between branches (a sketch of the is_important_frame filter follows the How It Works list below):

# Create a producer/consumer pair for cross-branch communication
frame_producer = ProducerProcessor(filter=is_important_frame)
frame_consumer = ConsumerProcessor(producer=frame_producer)

pipeline = Pipeline([
    transport.input(),
    ParallelPipeline(
        # Branch that generates important frames
        [
            stt,
            llm,
            tts,
            frame_producer,  # Share frames with the other branch
        ],
        # Branch that consumes those frames
        [
            frame_consumer,  # Receive frames from the other branch
            llm,  # Speech-to-speech LLM (audio in)
        ],
    ),
    transport.output(),
])

How It Works

- ParallelPipeline adds special source and sink processors to each branch
- System frames (like StartFrame and EndFrame) are sent to all branches
- Other frames flow downstream to all branch sources
- Results from each branch are collected at the sinks
- The pipeline ensures EndFrames are only passed through after all branches complete
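The Cross-Branch Communication example references an is_important_frame helper that this page does not define. Here is a minimal sketch of what such a filter could look like, assuming ProducerProcessor accepts an async predicate over frames; the frame type chosen here is only an example.

from pipecat.frames.frames import Frame, TTSAudioRawFrame


async def is_important_frame(frame: Frame) -> bool:
    # Forward only bot audio frames to the consuming branch; every other
    # frame type is ignored by the producer.
    return isinstance(frame, TTSAudioRawFrame)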
URL: https://docs.pipecat.ai/server/pipeline/pipeline-idle-detection
Title: Pipeline Idle Detection - Pipecat
==================================================

Overview

Pipeline idle detection is a feature that monitors activity in your pipeline and can automatically cancel tasks when no meaningful bot interactions are occurring.
This helps prevent pipelines from running indefinitely when a conversation has naturally ended but wasn't properly terminated.

How It Works

The system monitors specific "activity frames" that indicate the bot is actively engaged in the conversation. By default, these are:

- BotSpeakingFrame - when the bot is speaking
- LLMFullResponseEndFrame - when the LLM has completed a response

If no activity frames are detected within the configured timeout period (5 minutes by default), the system considers the pipeline idle and can automatically terminate it.

Idle detection only starts after the pipeline has begun processing frames. The idle timer resets whenever an activity frame (as specified in idle_timeout_frames) is received.

Configuration

You can configure idle detection behavior when creating a PipelineTask:

from pipecat.frames.frames import BotSpeakingFrame
from pipecat.pipeline.task import PipelineParams, PipelineTask

# Default configuration - cancel after 5 minutes of inactivity
task = PipelineTask(pipeline)

# Custom configuration
task = PipelineTask(
    pipeline,
    params=PipelineParams(allow_interruptions=True),
    idle_timeout_secs=600,  # 10 minute timeout
    idle_timeout_frames=(BotSpeakingFrame,),  # Only monitor bot speaking
    cancel_on_idle_timeout=False,  # Don't auto-cancel, just notify
)

Configuration Parameters

idle_timeout_secs (Optional[float], default: 300)
Timeout in seconds before considering the pipeline idle. Set to None to disable idle detection.

idle_timeout_frames (Tuple[Type[Frame], ...], default: (BotSpeakingFrame, LLMFullResponseEndFrame))
Frame types that prevent the pipeline from being considered idle.

cancel_on_idle_timeout (bool, default: True)
Whether to automatically cancel the pipeline task when the idle timeout is reached.

Handling Idle Timeouts

You can respond to idle timeout events by adding an event handler:

@task.event_handler("on_idle_timeout")
async def on_idle_timeout(task):
    logger.info("Pipeline has been idle for too long")
    # Perform any custom cleanup or logging
    # Note: if cancel_on_idle_timeout=True, the pipeline is cancelled after this handler runs

Example Implementation

Here's a complete example showing how to configure idle detection with custom handling:

from loguru import logger

from pipecat.frames.frames import TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask

# Create pipeline
pipeline = Pipeline([...])

# Configure task with custom idle settings
task = PipelineTask(
    pipeline,
    params=PipelineParams(allow_interruptions=True),
    idle_timeout_secs=180,  # 3 minutes
    cancel_on_idle_timeout=False,  # Don't auto-cancel
)

# Add event handler for idle timeout
@task.event_handler("on_idle_timeout")
async def on_idle_timeout(task):
    logger.info("Conversation has been idle for 3 minutes")
    # Say a farewell message
    await task.queue_frame(TTSSpeakFrame("I haven't heard from you in a while. Goodbye!"))
    # Then end the conversation gracefully
    await task.stop_when_done()

runner = PipelineRunner()
await runner.run(task)
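Because idle_timeout_secs accepts None, idle detection can also be switched off entirely for tasks that are expected to stay quiet for long stretches. A minimal sketch, using the same Pipeline placeholder convention as above:

from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineTask

pipeline = Pipeline([...])  # processors elided

# Disable idle detection: the task is never considered idle, so it must be
# ended explicitly (for example with task.stop_when_done()).
task = PipelineTask(pipeline, idle_timeout_secs=None)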
URL: https://docs.pipecat.ai/server/pipeline/pipeline-params
Title: PipelineParams - Pipecat
==================================================

Overview

The PipelineParams class provides a structured way to configure various aspects of pipeline execution. These parameters control behaviors like audio settings, metrics collection, heartbeat monitoring, and interruption handling.

Basic Usage

from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineParams, PipelineTask

# Create with default parameters
params = PipelineParams()

# Or customize specific parameters
params = PipelineParams(
    allow_interruptions=True,
    audio_in_sample_rate=16000,
    enable_metrics=True,
)

# Pass to PipelineTask
pipeline = Pipeline([...])
task = PipelineTask(pipeline, params=params)

Available Parameters

allow_interruptions (bool, default: False)
Whether to allow pipeline interruptions. When enabled, a user's speech will immediately interrupt the bot's response.

audio_in_sample_rate (int, default: 16000)
Input audio sample rate in Hz. Setting audio_in_sample_rate as a PipelineParam sets the input sample rate for all corresponding services in the pipeline.

audio_out_sample_rate (int, default: 24000)
Output audio sample rate in Hz. Setting audio_out_sample_rate as a PipelineParam sets the output sample rate for all corresponding services in the pipeline.

enable_heartbeats (bool, default: False)
Whether to enable heartbeat monitoring to detect pipeline stalls. See Pipeline Heartbeats for details.

heartbeats_period_secs (float, default: 1.0)
Period between heartbeats in seconds (when heartbeats are enabled).

enable_metrics (bool, default: False)
Whether to enable metrics collection for pipeline performance.

enable_usage_metrics (bool, default: False)
Whether to enable usage metrics tracking.

report_only_initial_ttfb (bool, default: False)
Whether to report only the initial time-to-first-byte metric.

send_initial_empty_metrics (bool, default: True)
Whether to send an initial empty metrics frame at pipeline start.

start_metadata (Dict[str, Any], default: {})
Additional metadata to include in the StartFrame.

Common Configurations

Audio Processing Configuration

You can set the audio input and output sample rates in PipelineParams to set the sample rate for all input and output services in the pipeline. This acts as a convenience to avoid setting the sample rate for each service individually.
Note that if sample rates are set on individual services, those values supersede the values set in PipelineParams.

params = PipelineParams(
    audio_in_sample_rate=8000,  # Lower sample rate input audio (e.g. telephony)
    audio_out_sample_rate=8000,  # Lower sample rate output audio to match
)

Performance Monitoring Configuration

Pipeline heartbeats monitor the health of your pipeline by sending periodic heartbeat frames through the system; see Pipeline Heartbeats above for details:

params = PipelineParams(
    enable_heartbeats=True,
    heartbeats_period_secs=2.0,  # Send heartbeats every 2 seconds
    enable_metrics=True,
)

How Parameters Are Used

The parameters you set in PipelineParams are passed to various components of the pipeline:

- StartFrame: many parameters are included in the StartFrame that initializes the pipeline
- Metrics collection: metrics settings configure what performance data is gathered
- Heartbeat monitoring: controls the pipeline's health monitoring system
- Audio processing: sample rates affect how audio is processed throughout the pipeline

Complete Example

from pipecat.observers.file_observer import FileObserver
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask

# Create comprehensive parameters
params = PipelineParams(
    allow_interruptions=True,
    audio_in_sample_rate=8000,
    audio_out_sample_rate=8000,
    enable_heartbeats=True,
    enable_metrics=True,
    enable_usage_metrics=True,
    heartbeats_period_secs=1.0,
    report_only_initial_ttfb=False,
    start_metadata={
        "conversation_id": "conv-123",
        "session_data": {
            "user_id": "user-456",
            "start_time": "2023-10-25T14:30:00Z",
        },
    },
)

# Create pipeline and task
pipeline = Pipeline([...])
task = PipelineTask(
    pipeline,
    params=params,
    observers=[FileObserver("pipeline_logs.jsonl")],
)

# Run the pipeline
runner = PipelineRunner()
await runner.run(task)

Additional Information

- Parameters are immutable once the pipeline starts
- The start_metadata dictionary can contain any serializable data
- For metrics collection to work properly, enable_metrics must be set to True
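As the Audio Processing Configuration section notes, a sample rate set directly on a service supersedes the pipeline-wide value. Below is a minimal sketch under the assumption that a TTS service (Cartesia is used purely as an example) accepts a sample_rate constructor argument; the import path and arguments may differ by Pipecat version and are not prescribed by this page.

from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.services.cartesia.tts import CartesiaTTSService  # assumed import path

# Pipeline-wide default: 8 kHz output audio
params = PipelineParams(audio_out_sample_rate=8000)

# Per-service override (assumed constructor arguments): this TTS service
# outputs 24 kHz audio regardless of the pipeline-wide setting above.
tts = CartesiaTTSService(api_key="...", voice_id="...", sample_rate=24000)

pipeline = Pipeline([tts])  # other processors elided
task = PipelineTask(pipeline, params=params)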
URL: https://docs.pipecat.ai/server/pipeline/pipeline-task
Title: PipelineTask - Pipecat
==================================================

Overview

PipelineTask is the central class for managing pipeline execution. It handles the lifecycle of the pipeline, processes frames in both directions, manages task cancellation, and provides event handlers for monitoring pipeline activity.
Basic Usage

from pipecat.frames.frames import TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask

# Create a pipeline
pipeline = Pipeline([...])

# Create a task with the pipeline
task = PipelineTask(pipeline)

# Queue frames for processing
await task.queue_frame(TTSSpeakFrame("Hello, how can I help you today?"))

# Run the pipeline
runner = PipelineRunner()
await runner.run(task)

Constructor Parameters

- pipeline (BasePipeline, required): The pipeline to execute.
- params (PipelineParams, default: PipelineParams()): Configuration parameters for the pipeline. See PipelineParams for details.
- observers (List[BaseObserver], default: []): List of observers for monitoring pipeline execution. See Observers for details.
- clock (BaseClock, default: SystemClock()): Clock implementation for timing operations.
- task_manager (Optional[BaseTaskManager], default: None): Custom task manager for handling asyncio tasks. If None, a default TaskManager is used.
- check_dangling_tasks (bool, default: True): Whether to check that processors' tasks finish properly.
- idle_timeout_secs (Optional[float], default: 300): Timeout in seconds before considering the pipeline idle. Set to None to disable idle detection. See Pipeline Idle Detection for details.
- idle_timeout_frames (Tuple[Type[Frame], ...], default: (BotSpeakingFrame, LLMFullResponseEndFrame)): Frame types that count as activity and prevent the pipeline from being considered idle. See Pipeline Idle Detection for details.
- cancel_on_idle_timeout (bool, default: True): Whether to automatically cancel the pipeline task when the idle timeout is reached. See Pipeline Idle Detection for details.
- enable_tracing (bool, default: False): Whether to enable OpenTelemetry tracing. See the OpenTelemetry guide for details.
- enable_turn_tracking (bool, default: False): Whether to enable turn tracking. See the OpenTelemetry guide for details.
- conversation_id (Optional[str], default: None): Custom ID for the conversation. If not provided, a UUID will be generated. See the OpenTelemetry guide for details.
- additional_span_attributes (Optional[dict], default: None): Any additional attributes to add to the top-level OpenTelemetry conversation span. See the OpenTelemetry guide for details.

Methods

Task Lifecycle Management

run() (async): Starts and manages the pipeline execution until completion or cancellation.

await task.run()

stop_when_done() (async): Sends an EndFrame to the pipeline to gracefully stop the task after all queued frames have been processed.

await task.stop_when_done()

cancel() (async): Stops the running pipeline immediately by sending a CancelFrame.

await task.cancel()

has_finished() (bool): Returns whether the task has finished (all processors have stopped).

if task.has_finished():
    print("Task is complete")

Frame Management

queue_frame() (async): Queues a single frame to be pushed down the pipeline.

await task.queue_frame(TTSSpeakFrame("Hello!"))

queue_frames() (async): Queues multiple frames to be pushed down the pipeline.

frames = [TTSSpeakFrame("Hello!"), TTSSpeakFrame("How are you?")]
await task.queue_frames(frames)
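The lifecycle and frame-management methods combine naturally: queue any final output, then ask the task to stop once that output has been processed. The helper below is a small sketch, not part of the Pipecat API; the function name and the farewell text are illustrative.

from pipecat.frames.frames import TTSSpeakFrame
from pipecat.pipeline.task import PipelineTask


async def say_goodbye_and_finish(task: PipelineTask):
    """Illustrative helper: speak a closing line, then end the task gracefully."""
    # Queue the farewell so it flows through TTS and transport like any other frame.
    await task.queue_frame(TTSSpeakFrame("Thanks for chatting. Goodbye!"))

    # stop_when_done() sends an EndFrame, so the task ends only after the
    # queued farewell has been fully processed.
    await task.stop_when_done()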
Event Handlers

PipelineTask provides an event handler that can be registered using the event_handler decorator:

on_idle_timeout: Triggered when no activity frames (as specified by idle_timeout_frames) have been received within the idle timeout period.

@task.event_handler("on_idle_timeout")
async def on_idle_timeout(task):
    print("Pipeline has been idle too long")
    await task.queue_frame(TTSSpeakFrame("Are you still there?"))
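Putting the idle-related constructor parameters and the on_idle_timeout handler together, one common pattern is to disable automatic cancellation, prompt the user once, and then end the task yourself. The following is a sketch under those assumptions: pipeline is built as in Basic Usage above, and the timeout value and prompt text are illustrative.

from pipecat.frames.frames import TTSSpeakFrame
from pipecat.pipeline.task import PipelineParams, PipelineTask

task = PipelineTask(
    pipeline,
    params=PipelineParams(allow_interruptions=True),
    idle_timeout_secs=120,          # consider the pipeline idle after 2 minutes
    cancel_on_idle_timeout=False,   # handle the timeout ourselves instead of cancelling
)


@task.event_handler("on_idle_timeout")
async def on_idle_timeout(task):
    # Gently prompt the user, then end the task once the prompt has played out.
    await task.queue_frame(TTSSpeakFrame("Are you still there? I'll hang up for now."))
    await task.stop_when_done()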
​ cancel_on_idle_timeout bool default: "True" Whether to automatically cancel the pipeline task when idle timeout is reached. See Pipeline Idle Detection for details. ​ enable_tracing bool default: "False" Whether to enable OpenTelemetry tracing. See The OpenTelemetry guide for details. ​ enable_turn_tracking bool default: "False" Whether to enable turn tracking. See The OpenTelemetry guide for details. ​ conversation_id Optional[str] default: "None" Custom ID for the conversation. If not provided, a UUID will be generated. See The OpenTelemetry guide for details. ​ additional_span_attributes Optional[dict] default: "None" Any additional attributes to add to top-level OpenTelemetry conversation span. See The OpenTelemetry guide for details. ​ Methods ​ Task Lifecycle Management ​ run() async Starts and manages the pipeline execution until completion or cancellation. Copy Ask AI await task.run() ​ stop_when_done() async Sends an EndFrame to the pipeline to gracefully stop the task after all queued frames have been processed. Copy Ask AI await task.stop_when_done() ​ cancel() async Stops the running pipeline immediately by sending a CancelFrame. Copy Ask AI await task.cancel() ​ has_finished() bool Returns whether the task has finished (all processors have stopped). Copy Ask AI if task.has_finished(): print ( "Task is complete" ) ​ Frame Management ​ queue_frame() async Queues a single frame to be pushed down the pipeline. Copy Ask AI await task.queue_frame(TTSSpeakFrame( "Hello!" )) ​ queue_frames() async Queues multiple frames to be pushed down the pipeline. Copy Ask AI frames = [TTSSpeakFrame( "Hello!" ), TTSSpeakFrame( "How are you?" )] await task.queue_frames(frames) ​ Event Handlers PipelineTask provides an event handler that can be registered using the event_handler decorator: ​ on_idle_timeout Triggered when no activity frames (as specified by idle_timeout_frames ) have been received within the idle timeout period. Copy Ask AI @task.event_handler ( "on_idle_timeout" ) async def on_idle_timeout ( task ): print ( "Pipeline has been idle too long" ) await task.queue_frame(TTSSpeakFrame( "Are you still there?" )) PipelineParams Pipeline Idle Detection On this page Overview Basic Usage Constructor Parameters Methods Task Lifecycle Management Frame Management Event Handlers on_idle_timeout Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/pipeline_pipeline-task_4563c25e.txt b/pipeline_pipeline-task_4563c25e.txt new file mode 100644 index 0000000000000000000000000000000000000000..1949dfcc5698ac5110ca264e380ea0717254e219 --- /dev/null +++ b/pipeline_pipeline-task_4563c25e.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/pipeline/pipeline-task#methods +Title: PipelineTask - Pipecat +================================================== + +PipelineTask - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... 
Navigation Pipeline PipelineTask Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview PipelineTask is the central class for managing pipeline execution. It handles the lifecycle of the pipeline, processes frames in both directions, manages task cancellation, and provides event handlers for monitoring pipeline activity. ​ Basic Usage Copy Ask AI from pipecat.pipeline.pipeline import Pipeline from pipecat.pipeline.runner import PipelineRunner from pipecat.pipeline.task import PipelineParams, PipelineTask # Create a pipeline pipeline = Pipeline([ ... ]) # Create a task with the pipeline task = PipelineTask(pipeline) # Queue frames for processing await task.queue_frame(TTSSpeakFrame( "Hello, how can I help you today?" )) # Run the pipeline runner = PipelineRunner() await runner.run(task) ​ Constructor Parameters ​ pipeline BasePipeline required The pipeline to execute. ​ params PipelineParams default: "PipelineParams()" Configuration parameters for the pipeline. See PipelineParams for details. ​ observers List[BaseObserver] default: "[]" List of observers for monitoring pipeline execution. See Observers for details. ​ clock BaseClock default: "SystemClock()" Clock implementation for timing operations. ​ task_manager Optional[BaseTaskManager] default: "None" Custom task manager for handling asyncio tasks. If None, a default TaskManager is used. ​ check_dangling_tasks bool default: "True" Whether to check for processors’ tasks finishing properly. ​ idle_timeout_secs Optional[float] default: "300" Timeout in seconds before considering the pipeline idle. Set to None to disable idle detection. See Pipeline Idle Detection for details. ​ idle_timeout_frames Tuple[Type[Frame], ...] default: "(BotSpeakingFrame, LLMFullResponseEndFrame)" Frame types that should prevent the pipeline from being considered idle. See Pipeline Idle Detection for details. ​ cancel_on_idle_timeout bool default: "True" Whether to automatically cancel the pipeline task when idle timeout is reached. See Pipeline Idle Detection for details. ​ enable_tracing bool default: "False" Whether to enable OpenTelemetry tracing. See The OpenTelemetry guide for details. ​ enable_turn_tracking bool default: "False" Whether to enable turn tracking. See The OpenTelemetry guide for details. ​ conversation_id Optional[str] default: "None" Custom ID for the conversation. If not provided, a UUID will be generated. See The OpenTelemetry guide for details. ​ additional_span_attributes Optional[dict] default: "None" Any additional attributes to add to top-level OpenTelemetry conversation span. See The OpenTelemetry guide for details. ​ Methods ​ Task Lifecycle Management ​ run() async Starts and manages the pipeline execution until completion or cancellation. Copy Ask AI await task.run() ​ stop_when_done() async Sends an EndFrame to the pipeline to gracefully stop the task after all queued frames have been processed. 
Copy Ask AI await task.stop_when_done() ​ cancel() async Stops the running pipeline immediately by sending a CancelFrame. Copy Ask AI await task.cancel() ​ has_finished() bool Returns whether the task has finished (all processors have stopped). Copy Ask AI if task.has_finished(): print ( "Task is complete" ) ​ Frame Management ​ queue_frame() async Queues a single frame to be pushed down the pipeline. Copy Ask AI await task.queue_frame(TTSSpeakFrame( "Hello!" )) ​ queue_frames() async Queues multiple frames to be pushed down the pipeline. Copy Ask AI frames = [TTSSpeakFrame( "Hello!" ), TTSSpeakFrame( "How are you?" )] await task.queue_frames(frames) ​ Event Handlers PipelineTask provides an event handler that can be registered using the event_handler decorator: ​ on_idle_timeout Triggered when no activity frames (as specified by idle_timeout_frames ) have been received within the idle timeout period. Copy Ask AI @task.event_handler ( "on_idle_timeout" ) async def on_idle_timeout ( task ): print ( "Pipeline has been idle too long" ) await task.queue_frame(TTSSpeakFrame( "Are you still there?" )) PipelineParams Pipeline Idle Detection On this page Overview Basic Usage Constructor Parameters Methods Task Lifecycle Management Frame Management Event Handlers on_idle_timeout Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/pipeline_pipeline-task_6496025a.txt b/pipeline_pipeline-task_6496025a.txt new file mode 100644 index 0000000000000000000000000000000000000000..8899f548efed001e3e26388b7e43aa1788f8163d --- /dev/null +++ b/pipeline_pipeline-task_6496025a.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/pipeline/pipeline-task#param-idle-timeout-secs +Title: PipelineTask - Pipecat +================================================== + +PipelineTask - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Pipeline PipelineTask Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview PipelineTask is the central class for managing pipeline execution. It handles the lifecycle of the pipeline, processes frames in both directions, manages task cancellation, and provides event handlers for monitoring pipeline activity. ​ Basic Usage Copy Ask AI from pipecat.pipeline.pipeline import Pipeline from pipecat.pipeline.runner import PipelineRunner from pipecat.pipeline.task import PipelineParams, PipelineTask # Create a pipeline pipeline = Pipeline([ ... ]) # Create a task with the pipeline task = PipelineTask(pipeline) # Queue frames for processing await task.queue_frame(TTSSpeakFrame( "Hello, how can I help you today?" )) # Run the pipeline runner = PipelineRunner() await runner.run(task) ​ Constructor Parameters ​ pipeline BasePipeline required The pipeline to execute. ​ params PipelineParams default: "PipelineParams()" Configuration parameters for the pipeline. See PipelineParams for details. 
​ observers List[BaseObserver] default: "[]" List of observers for monitoring pipeline execution. See Observers for details. ​ clock BaseClock default: "SystemClock()" Clock implementation for timing operations. ​ task_manager Optional[BaseTaskManager] default: "None" Custom task manager for handling asyncio tasks. If None, a default TaskManager is used. ​ check_dangling_tasks bool default: "True" Whether to check for processors’ tasks finishing properly. ​ idle_timeout_secs Optional[float] default: "300" Timeout in seconds before considering the pipeline idle. Set to None to disable idle detection. See Pipeline Idle Detection for details. ​ idle_timeout_frames Tuple[Type[Frame], ...] default: "(BotSpeakingFrame, LLMFullResponseEndFrame)" Frame types that should prevent the pipeline from being considered idle. See Pipeline Idle Detection for details. ​ cancel_on_idle_timeout bool default: "True" Whether to automatically cancel the pipeline task when idle timeout is reached. See Pipeline Idle Detection for details. ​ enable_tracing bool default: "False" Whether to enable OpenTelemetry tracing. See The OpenTelemetry guide for details. ​ enable_turn_tracking bool default: "False" Whether to enable turn tracking. See The OpenTelemetry guide for details. ​ conversation_id Optional[str] default: "None" Custom ID for the conversation. If not provided, a UUID will be generated. See The OpenTelemetry guide for details. ​ additional_span_attributes Optional[dict] default: "None" Any additional attributes to add to top-level OpenTelemetry conversation span. See The OpenTelemetry guide for details. ​ Methods ​ Task Lifecycle Management ​ run() async Starts and manages the pipeline execution until completion or cancellation. Copy Ask AI await task.run() ​ stop_when_done() async Sends an EndFrame to the pipeline to gracefully stop the task after all queued frames have been processed. Copy Ask AI await task.stop_when_done() ​ cancel() async Stops the running pipeline immediately by sending a CancelFrame. Copy Ask AI await task.cancel() ​ has_finished() bool Returns whether the task has finished (all processors have stopped). Copy Ask AI if task.has_finished(): print ( "Task is complete" ) ​ Frame Management ​ queue_frame() async Queues a single frame to be pushed down the pipeline. Copy Ask AI await task.queue_frame(TTSSpeakFrame( "Hello!" )) ​ queue_frames() async Queues multiple frames to be pushed down the pipeline. Copy Ask AI frames = [TTSSpeakFrame( "Hello!" ), TTSSpeakFrame( "How are you?" )] await task.queue_frames(frames) ​ Event Handlers PipelineTask provides an event handler that can be registered using the event_handler decorator: ​ on_idle_timeout Triggered when no activity frames (as specified by idle_timeout_frames ) have been received within the idle timeout period. Copy Ask AI @task.event_handler ( "on_idle_timeout" ) async def on_idle_timeout ( task ): print ( "Pipeline has been idle too long" ) await task.queue_frame(TTSSpeakFrame( "Are you still there?" )) PipelineParams Pipeline Idle Detection On this page Overview Basic Usage Constructor Parameters Methods Task Lifecycle Management Frame Management Event Handlers on_idle_timeout Assistant Responses are generated using AI and may contain mistakes. 
\ No newline at end of file diff --git a/pipeline_pipeline-task_7cfe1600.txt b/pipeline_pipeline-task_7cfe1600.txt new file mode 100644 index 0000000000000000000000000000000000000000..b0c2ffbd0446e87e4ff97affd93ed03650d5627a --- /dev/null +++ b/pipeline_pipeline-task_7cfe1600.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/pipeline/pipeline-task#frame-management +Title: PipelineTask - Pipecat +================================================== + +PipelineTask - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Pipeline PipelineTask Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview PipelineTask is the central class for managing pipeline execution. It handles the lifecycle of the pipeline, processes frames in both directions, manages task cancellation, and provides event handlers for monitoring pipeline activity. ​ Basic Usage Copy Ask AI from pipecat.pipeline.pipeline import Pipeline from pipecat.pipeline.runner import PipelineRunner from pipecat.pipeline.task import PipelineParams, PipelineTask # Create a pipeline pipeline = Pipeline([ ... ]) # Create a task with the pipeline task = PipelineTask(pipeline) # Queue frames for processing await task.queue_frame(TTSSpeakFrame( "Hello, how can I help you today?" )) # Run the pipeline runner = PipelineRunner() await runner.run(task) ​ Constructor Parameters ​ pipeline BasePipeline required The pipeline to execute. ​ params PipelineParams default: "PipelineParams()" Configuration parameters for the pipeline. See PipelineParams for details. ​ observers List[BaseObserver] default: "[]" List of observers for monitoring pipeline execution. See Observers for details. ​ clock BaseClock default: "SystemClock()" Clock implementation for timing operations. ​ task_manager Optional[BaseTaskManager] default: "None" Custom task manager for handling asyncio tasks. If None, a default TaskManager is used. ​ check_dangling_tasks bool default: "True" Whether to check for processors’ tasks finishing properly. ​ idle_timeout_secs Optional[float] default: "300" Timeout in seconds before considering the pipeline idle. Set to None to disable idle detection. See Pipeline Idle Detection for details. ​ idle_timeout_frames Tuple[Type[Frame], ...] default: "(BotSpeakingFrame, LLMFullResponseEndFrame)" Frame types that should prevent the pipeline from being considered idle. See Pipeline Idle Detection for details. ​ cancel_on_idle_timeout bool default: "True" Whether to automatically cancel the pipeline task when idle timeout is reached. See Pipeline Idle Detection for details. ​ enable_tracing bool default: "False" Whether to enable OpenTelemetry tracing. See The OpenTelemetry guide for details. ​ enable_turn_tracking bool default: "False" Whether to enable turn tracking. See The OpenTelemetry guide for details. ​ conversation_id Optional[str] default: "None" Custom ID for the conversation. 
If not provided, a UUID will be generated. See The OpenTelemetry guide for details. ​ additional_span_attributes Optional[dict] default: "None" Any additional attributes to add to top-level OpenTelemetry conversation span. See The OpenTelemetry guide for details. ​ Methods ​ Task Lifecycle Management ​ run() async Starts and manages the pipeline execution until completion or cancellation. Copy Ask AI await task.run() ​ stop_when_done() async Sends an EndFrame to the pipeline to gracefully stop the task after all queued frames have been processed. Copy Ask AI await task.stop_when_done() ​ cancel() async Stops the running pipeline immediately by sending a CancelFrame. Copy Ask AI await task.cancel() ​ has_finished() bool Returns whether the task has finished (all processors have stopped). Copy Ask AI if task.has_finished(): print ( "Task is complete" ) ​ Frame Management ​ queue_frame() async Queues a single frame to be pushed down the pipeline. Copy Ask AI await task.queue_frame(TTSSpeakFrame( "Hello!" )) ​ queue_frames() async Queues multiple frames to be pushed down the pipeline. Copy Ask AI frames = [TTSSpeakFrame( "Hello!" ), TTSSpeakFrame( "How are you?" )] await task.queue_frames(frames) ​ Event Handlers PipelineTask provides an event handler that can be registered using the event_handler decorator: ​ on_idle_timeout Triggered when no activity frames (as specified by idle_timeout_frames ) have been received within the idle timeout period. Copy Ask AI @task.event_handler ( "on_idle_timeout" ) async def on_idle_timeout ( task ): print ( "Pipeline has been idle too long" ) await task.queue_frame(TTSSpeakFrame( "Are you still there?" )) PipelineParams Pipeline Idle Detection On this page Overview Basic Usage Constructor Parameters Methods Task Lifecycle Management Frame Management Event Handlers on_idle_timeout Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/pipeline_pipeline-task_81448819.txt b/pipeline_pipeline-task_81448819.txt new file mode 100644 index 0000000000000000000000000000000000000000..882028af8df0031e4e5de88efb1b6cc28269e4a3 --- /dev/null +++ b/pipeline_pipeline-task_81448819.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/pipeline/pipeline-task#param-stop-when-done +Title: PipelineTask - Pipecat +================================================== + +PipelineTask - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Pipeline PipelineTask Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview PipelineTask is the central class for managing pipeline execution. It handles the lifecycle of the pipeline, processes frames in both directions, manages task cancellation, and provides event handlers for monitoring pipeline activity. 
​ Basic Usage Copy Ask AI from pipecat.pipeline.pipeline import Pipeline from pipecat.pipeline.runner import PipelineRunner from pipecat.pipeline.task import PipelineParams, PipelineTask # Create a pipeline pipeline = Pipeline([ ... ]) # Create a task with the pipeline task = PipelineTask(pipeline) # Queue frames for processing await task.queue_frame(TTSSpeakFrame( "Hello, how can I help you today?" )) # Run the pipeline runner = PipelineRunner() await runner.run(task) ​ Constructor Parameters ​ pipeline BasePipeline required The pipeline to execute. ​ params PipelineParams default: "PipelineParams()" Configuration parameters for the pipeline. See PipelineParams for details. ​ observers List[BaseObserver] default: "[]" List of observers for monitoring pipeline execution. See Observers for details. ​ clock BaseClock default: "SystemClock()" Clock implementation for timing operations. ​ task_manager Optional[BaseTaskManager] default: "None" Custom task manager for handling asyncio tasks. If None, a default TaskManager is used. ​ check_dangling_tasks bool default: "True" Whether to check for processors’ tasks finishing properly. ​ idle_timeout_secs Optional[float] default: "300" Timeout in seconds before considering the pipeline idle. Set to None to disable idle detection. See Pipeline Idle Detection for details. ​ idle_timeout_frames Tuple[Type[Frame], ...] default: "(BotSpeakingFrame, LLMFullResponseEndFrame)" Frame types that should prevent the pipeline from being considered idle. See Pipeline Idle Detection for details. ​ cancel_on_idle_timeout bool default: "True" Whether to automatically cancel the pipeline task when idle timeout is reached. See Pipeline Idle Detection for details. ​ enable_tracing bool default: "False" Whether to enable OpenTelemetry tracing. See The OpenTelemetry guide for details. ​ enable_turn_tracking bool default: "False" Whether to enable turn tracking. See The OpenTelemetry guide for details. ​ conversation_id Optional[str] default: "None" Custom ID for the conversation. If not provided, a UUID will be generated. See The OpenTelemetry guide for details. ​ additional_span_attributes Optional[dict] default: "None" Any additional attributes to add to top-level OpenTelemetry conversation span. See The OpenTelemetry guide for details. ​ Methods ​ Task Lifecycle Management ​ run() async Starts and manages the pipeline execution until completion or cancellation. Copy Ask AI await task.run() ​ stop_when_done() async Sends an EndFrame to the pipeline to gracefully stop the task after all queued frames have been processed. Copy Ask AI await task.stop_when_done() ​ cancel() async Stops the running pipeline immediately by sending a CancelFrame. Copy Ask AI await task.cancel() ​ has_finished() bool Returns whether the task has finished (all processors have stopped). Copy Ask AI if task.has_finished(): print ( "Task is complete" ) ​ Frame Management ​ queue_frame() async Queues a single frame to be pushed down the pipeline. Copy Ask AI await task.queue_frame(TTSSpeakFrame( "Hello!" )) ​ queue_frames() async Queues multiple frames to be pushed down the pipeline. Copy Ask AI frames = [TTSSpeakFrame( "Hello!" ), TTSSpeakFrame( "How are you?" )] await task.queue_frames(frames) ​ Event Handlers PipelineTask provides an event handler that can be registered using the event_handler decorator: ​ on_idle_timeout Triggered when no activity frames (as specified by idle_timeout_frames ) have been received within the idle timeout period. 
Copy Ask AI @task.event_handler ( "on_idle_timeout" ) async def on_idle_timeout ( task ): print ( "Pipeline has been idle too long" ) await task.queue_frame(TTSSpeakFrame( "Are you still there?" )) PipelineParams Pipeline Idle Detection On this page Overview Basic Usage Constructor Parameters Methods Task Lifecycle Management Frame Management Event Handlers on_idle_timeout Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/pipeline_pipeline-task_8fce26ac.txt b/pipeline_pipeline-task_8fce26ac.txt new file mode 100644 index 0000000000000000000000000000000000000000..29133ab4a56349e4e36e44e5fd268fc394c779b8 --- /dev/null +++ b/pipeline_pipeline-task_8fce26ac.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/pipeline/pipeline-task#event-handlers +Title: PipelineTask - Pipecat +================================================== + +PipelineTask - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... Navigation Pipeline PipelineTask Getting Started Guides Server APIs Client SDKs Community GitHub Examples Changelog Server API Reference API Reference Reference docs Services Supported Services Transport Serializers Speech-to-Text LLM Text-to-Speech Speech-to-Speech Image Generation Video Memory Vision Analytics & Monitoring Utilities Advanced Frame Processors Audio Processing Frame Filters Metrics and Telemetry MCP Observers Service Utilities Smart Turn Detection Task Handling and Monitoring Telephony Text Aggregators and Filters User and Bot Transcriptions User Interruptions Frameworks RTVI Pipecat Flows Pipeline PipelineParams PipelineTask Pipeline Idle Detection Pipeline Heartbeats ParallelPipeline ​ Overview PipelineTask is the central class for managing pipeline execution. It handles the lifecycle of the pipeline, processes frames in both directions, manages task cancellation, and provides event handlers for monitoring pipeline activity. ​ Basic Usage Copy Ask AI from pipecat.pipeline.pipeline import Pipeline from pipecat.pipeline.runner import PipelineRunner from pipecat.pipeline.task import PipelineParams, PipelineTask # Create a pipeline pipeline = Pipeline([ ... ]) # Create a task with the pipeline task = PipelineTask(pipeline) # Queue frames for processing await task.queue_frame(TTSSpeakFrame( "Hello, how can I help you today?" )) # Run the pipeline runner = PipelineRunner() await runner.run(task) ​ Constructor Parameters ​ pipeline BasePipeline required The pipeline to execute. ​ params PipelineParams default: "PipelineParams()" Configuration parameters for the pipeline. See PipelineParams for details. ​ observers List[BaseObserver] default: "[]" List of observers for monitoring pipeline execution. See Observers for details. ​ clock BaseClock default: "SystemClock()" Clock implementation for timing operations. ​ task_manager Optional[BaseTaskManager] default: "None" Custom task manager for handling asyncio tasks. If None, a default TaskManager is used. ​ check_dangling_tasks bool default: "True" Whether to check for processors’ tasks finishing properly. ​ idle_timeout_secs Optional[float] default: "300" Timeout in seconds before considering the pipeline idle. Set to None to disable idle detection. See Pipeline Idle Detection for details. ​ idle_timeout_frames Tuple[Type[Frame], ...] default: "(BotSpeakingFrame, LLMFullResponseEndFrame)" Frame types that should prevent the pipeline from being considered idle. See Pipeline Idle Detection for details. 
​ cancel_on_idle_timeout bool default: "True" Whether to automatically cancel the pipeline task when idle timeout is reached. See Pipeline Idle Detection for details. ​ enable_tracing bool default: "False" Whether to enable OpenTelemetry tracing. See The OpenTelemetry guide for details. ​ enable_turn_tracking bool default: "False" Whether to enable turn tracking. See The OpenTelemetry guide for details. ​ conversation_id Optional[str] default: "None" Custom ID for the conversation. If not provided, a UUID will be generated. See The OpenTelemetry guide for details. ​ additional_span_attributes Optional[dict] default: "None" Any additional attributes to add to top-level OpenTelemetry conversation span. See The OpenTelemetry guide for details. ​ Methods ​ Task Lifecycle Management ​ run() async Starts and manages the pipeline execution until completion or cancellation. Copy Ask AI await task.run() ​ stop_when_done() async Sends an EndFrame to the pipeline to gracefully stop the task after all queued frames have been processed. Copy Ask AI await task.stop_when_done() ​ cancel() async Stops the running pipeline immediately by sending a CancelFrame. Copy Ask AI await task.cancel() ​ has_finished() bool Returns whether the task has finished (all processors have stopped). Copy Ask AI if task.has_finished(): print ( "Task is complete" ) ​ Frame Management ​ queue_frame() async Queues a single frame to be pushed down the pipeline. Copy Ask AI await task.queue_frame(TTSSpeakFrame( "Hello!" )) ​ queue_frames() async Queues multiple frames to be pushed down the pipeline. Copy Ask AI frames = [TTSSpeakFrame( "Hello!" ), TTSSpeakFrame( "How are you?" )] await task.queue_frames(frames) ​ Event Handlers PipelineTask provides an event handler that can be registered using the event_handler decorator: ​ on_idle_timeout Triggered when no activity frames (as specified by idle_timeout_frames ) have been received within the idle timeout period. Copy Ask AI @task.event_handler ( "on_idle_timeout" ) async def on_idle_timeout ( task ): print ( "Pipeline has been idle too long" ) await task.queue_frame(TTSSpeakFrame( "Are you still there?" )) PipelineParams Pipeline Idle Detection On this page Overview Basic Usage Constructor Parameters Methods Task Lifecycle Management Frame Management Event Handlers on_idle_timeout Assistant Responses are generated using AI and may contain mistakes. \ No newline at end of file diff --git a/pipeline_pipeline-task_987a4cfa.txt b/pipeline_pipeline-task_987a4cfa.txt new file mode 100644 index 0000000000000000000000000000000000000000..91523da50413a4b29522e1dcdab96079de411001 --- /dev/null +++ b/pipeline_pipeline-task_987a4cfa.txt @@ -0,0 +1,5 @@ +URL: https://docs.pipecat.ai/server/pipeline/pipeline-task#param-conversation-id +Title: PipelineTask - Pipecat +================================================== + +PipelineTask - Pipecat Pipecat home page Search... ⌘ K Ask AI Search... 
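With cancel_on_idle_timeout=False (as in the sketch above), the on_idle_timeout handler is a natural place to wind the conversation down yourself. The handler below is an assumed pattern rather than part of the reference: it warns the user, then calls stop_when_done() so the task ends gracefully after the warning has been processed.

from pipecat.frames.frames import TTSSpeakFrame

@task.event_handler("on_idle_timeout")
async def handle_idle(task):  # the function name is arbitrary; the event name string is what matters
    # Let the user know the session is about to end...
    await task.queue_frame(TTSSpeakFrame("I haven't heard anything for a while, so I'll end the call."))
    # ...then stop gracefully once the queued frame has been processed.
    await task.stop_when_done()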
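Finally, a small end-to-end sketch combining queue_frames(), stop_when_done(), and has_finished(): the task plays two scripted lines and shuts down once they have been processed. The pipeline contents are placeholders, and the overall flow is an assumption modeled on the Basic Usage example above rather than a verbatim recipe from this reference.

import asyncio

from pipecat.frames.frames import TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask


async def main():
    pipeline = Pipeline([...])  # placeholder: e.g. a TTS service followed by a transport output
    task = PipelineTask(pipeline)

    # Queue the scripted lines, then request a graceful stop after they are processed.
    await task.queue_frames([
        TTSSpeakFrame("Thanks for calling."),
        TTSSpeakFrame("Goodbye!"),
    ])
    await task.stop_when_done()

    # The runner drives task.run() and returns once the pipeline has stopped.
    runner = PipelineRunner()
    await runner.run(task)

    if task.has_finished():
        print("Task is complete")


asyncio.run(main())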
\ No newline at end of file
diff --git a/react-native_introduction_7b5bf05a.txt b/react-native_introduction_7b5bf05a.txt
new file mode 100644
index 0000000000000000000000000000000000000000..6b30a5f05113437b3e8a0955f3ffe651e6a8e9c6
--- /dev/null
+++ b/react-native_introduction_7b5bf05a.txt
@@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/client/react-native/introduction#requirements
+Title: SDK Introduction - Pipecat
+==================================================
+
The Pipecat React Native SDK leverages the Pipecat JavaScript SDK to provide seamless integration for React Native applications. Since the JavaScript SDK is designed to work across both web and React Native platforms, the core functionalities remain the same:

- Device and media stream management
- Managing bot configuration
- Sending actions to the bot
- Handling bot messages and responses
- Managing session state and errors

The primary difference lies in the transport layer, which is tailored to support the unique requirements of the React Native environment. For example, when using the SDK with React Native, you would install RNDailyTransport instead of DailyTransport.

Installation

Install the SDK and a transport implementation (e.g. Daily for WebRTC):
npm i @pipecat-ai/react-native-daily-transport
npm i @daily-co/react-native-daily-js@^0.70.0
npm i @daily-co/react-native-webrtc@^118.0.3-daily.2
npm i @react-native-async-storage/async-storage@^1.23.1
npm i react-native-background-timer@^2.4.1
npm i react-native-get-random-values@^1.11.0

Installing @pipecat-ai/react-native-daily-transport automatically includes the corresponding version of the JavaScript SDK.

If you are using Expo, you will also need to add the following dependencies:

npm i @config-plugins/react-native-webrtc@^10.0.0
npm i @daily-co/config-plugin-rn-daily-js@0.0.7

Requirements

This package introduces some constraints on what OS/SDK versions your project can support:

- iOS: Deployment target >= 13
- Android: minSdkVersion >= 24

Quick start

Here’s a simple example using Daily as the transport layer:

import { RNDailyTransport } from '@pipecat-ai/react-native-daily-transport';
import { RTVIClient } from '@pipecat-ai/client-js';

// Create and configure the client
let voiceClient = new RTVIClient({
  params: {
    baseUrl: process.env.PIPECAT_API_URL || "/api",
  },
  transport: new RNDailyTransport(),
  enableMic: true
});

// Connect to your bot
await voiceClient.connect();

You can find a basic working example here and a more comprehensive example here.

Explore the SDK

The Pipecat React Native SDK leverages the Pipecat JavaScript SDK for seamless integration with React Native applications. For detailed information, refer to our JavaScript documentation. Just ensure you use the appropriate transport layer for React Native.

- Client Constructor: Configure your client instance with transport and callbacks
- Client Methods: Core methods for interacting with your bot
- API Reference: Detailed documentation of all available APIs
- Helpers: Utility functions for common operations
\ No newline at end of file
diff --git a/react_components_0a2db5b0.txt b/react_components_0a2db5b0.txt
new file mode 100644
index 0000000000000000000000000000000000000000..8aeeee793b2db18a32006e8e7476450e24629193
--- /dev/null
+++ b/react_components_0a2db5b0.txt
@@ -0,0 +1,5 @@
+URL: https://docs.pipecat.ai/client/react/components#param-disabled-1
+Title: Components - Pipecat
+==================================================
+
The Pipecat React SDK provides several components for handling audio, video, and visualization in your application.

PipecatClientProvider

The root component for providing Pipecat client context to your application.

<PipecatClientProvider client={pcClient}>
  {/* Child components */}
</PipecatClientProvider>

Props

client (PipecatClient, required)
  A singleton instance of PipecatClient

PipecatClientAudio

Creates a new