Deepgram

Overview

Deepgram provides three STT service implementations:

DeepgramSTTService for real-time speech recognition using Deepgram’s standard WebSocket API with support for interim results, language detection, and voice activity detection (VAD)
DeepgramFluxSTTService for conversational AI using Deepgram Flux, with built-in turn detection and eager end-of-turn events that improve response timing
DeepgramSageMakerSTTService for real-time speech recognition using Deepgram models deployed on AWS SageMaker endpoints via HTTP/2 bidirectional streaming

Since Deepgram Flux provides its own user turn start and end detection, you should use ExternalUserTurnStrategies to let Flux handle turn management. See User Turn Strategies for configuration details.

Deepgram STT API Reference: Pipecat’s API methods for standard Deepgram STT
Deepgram Flux API Reference: Pipecat’s API methods for Deepgram Flux STT
Standard STT Example: Complete example with standard Deepgram STT
Flux STT Example: Complete example with Deepgram Flux STT
SageMaker Example: Complete example with Deepgram on SageMaker
Deepgram Documentation: Official Deepgram documentation and features
Deepgram Console: Access API keys and transcription models

Installation

To use Deepgram STT services, install the required dependencies:

pip install "pipecat-ai[deepgram]"

For the SageMaker variant, install both the Deepgram and SageMaker dependencies:

pip install "pipecat-ai[deepgram,sagemaker]"

Prerequisites

Deepgram Account Setup

Before using DeepgramSTTService or DeepgramFluxSTTService, you need:

Deepgram Account: Sign up at Deepgram Console
API Key: Generate an API key from your console dashboard
Model Selection: Choose from available transcription models and features

Required Environment Variables

DEEPGRAM_API_KEY: Your Deepgram API key for authentication
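A fail-fast check can surface a missing key at startup rather than when the service first connects. The sketch below uses a hypothetical helper named `require_env`; it is not part of Pipecat or the Deepgram SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    `require_env` is a hypothetical helper used only for illustration.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Typical use before constructing the service:
# api_key = require_env("DEEPGRAM_API_KEY")
```

The same helper could be reused for any other settings your deployment reads from the environment.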

AWS SageMaker Setup

Before using DeepgramSageMakerSTTService, you need:

AWS Account: With credentials configured (via environment variables, AWS CLI, or instance metadata)
SageMaker Endpoint: A deployed SageMaker endpoint with a Deepgram model
Deepgram SDK: The Deepgram SDK is required for LiveOptions configuration

Configuration

DeepgramSTTService

Parameter	Type	Default	Description
`api_key`	`str`	required	Deepgram API key for authentication.
`base_url`	`str`	`""`	Custom Deepgram API base URL. Leave empty for the default endpoint.
`sample_rate`	`int`	`None`	Audio sample rate in Hz. When `None`, uses the value from `live_options` or the pipeline’s configured sample rate.
`live_options`	`LiveOptions`	`None`	Deepgram LiveOptions for detailed configuration. When provided, these settings are merged with the defaults. See Deepgram LiveOptions for available options.
`addons`	`Dict`	`None`	Additional Deepgram features to enable.
`ttfs_p99_latency`	`float`	`DEEPGRAM_TTFS_P99`	P99 latency from speech end to final transcript in seconds. Override for your deployment.

The default LiveOptions are:

Option	Default	Description
`encoding`	`"linear16"`	Audio encoding format.
`language`	`Language.EN`	Recognition language.
`model`	`"nova-3-general"`	Deepgram model to use.
`channels`	`1`	Number of audio channels.
`interim_results`	`True`	Stream partial recognition results.
`smart_format`	`False`	Apply smart formatting.
`punctuate`	`True`	Add punctuation to transcripts.
`profanity_filter`	`True`	Filter profanity from transcripts.
`vad_events`	`False`	Enable Deepgram’s built-in VAD events (deprecated).
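To make the "merged with the defaults" behavior concrete, here is a minimal sketch using plain dictionaries. It assumes that a user-supplied value wins over the default for the same key; `merge_live_options` is a hypothetical helper, not Pipecat's implementation, and plain strings stand in for the `Language` enum.

```python
from typing import Any, Dict, Optional

# Defaults copied from the table above. The string "en" stands in for
# Language.EN so the sketch needs no third-party imports.
DEFAULT_LIVE_OPTIONS: Dict[str, Any] = {
    "encoding": "linear16",
    "language": "en",
    "model": "nova-3-general",
    "channels": 1,
    "interim_results": True,
    "smart_format": False,
    "punctuate": True,
    "profanity_filter": True,
    "vad_events": False,
}


def merge_live_options(overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Overlay user-supplied options on the defaults (assumed semantics)."""
    merged = dict(DEFAULT_LIVE_OPTIONS)  # copy so the defaults stay untouched
    merged.update(overrides or {})
    return merged


merged = merge_live_options({"language": "es", "smart_format": True})
print(merged["language"], merged["smart_format"], merged["model"])
# prints: es True nova-3-general
```

Under this reading, you only need to pass the options you want to change; everything else keeps its default.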

DeepgramFluxSTTService

Parameter	Type	Default	Description
`api_key`	`str`	required	Deepgram API key for authentication.
`url`	`str`	`"wss://api.deepgram.com/v2/listen"`	WebSocket URL for the Deepgram Flux API.
`sample_rate`	`int`	`None`	Audio sample rate in Hz. When `None`, uses the pipeline’s configured sample rate.
`model`	`str`	`"flux-general-en"`	Deepgram Flux model to use for transcription.
`flux_encoding`	`str`	`"linear16"`	Audio encoding format required by the Flux API. Must be `"linear16"`.
`params`	`InputParams`	`None`	Configuration parameters for the Flux API. See Flux InputParams below.
`should_interrupt`	`bool`	`True`	Whether the bot should be interrupted when Flux detects user speech.

Flux InputParams

Parameters passed via the params constructor argument for DeepgramFluxSTTService.

Parameter	Type	Default	Description
`eager_eot_threshold`	`float`	`None`	EagerEndOfTurn threshold. Lower values trigger faster responses with more LLM calls; higher values are more conservative. `None` disables EagerEndOfTurn.
`eot_threshold`	`float`	`None`	End-of-turn confidence threshold. Lower values end turns faster. When `None`, Deepgram’s default of 0.7 applies.
`eot_timeout_ms`	`int`	`None`	Time in ms after speech ends at which the turn is finished regardless of confidence. When `None`, Deepgram’s default of 5000 applies.
`keyterm`	`list`	`[]`	Key terms to boost recognition accuracy for specialized terminology.
`mip_opt_out`	`bool`	`None`	Opt out of Deepgram’s Model Improvement Program.
`tag`	`list`	`[]`	Tags for request identification during usage reporting.
`min_confidence`	`float`	`None`	Minimum average confidence required to produce a `TranscriptionFrame`.
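The interaction between the thresholds in the table above can be illustrated with a toy model. This is only a sketch of the documented semantics, not Deepgram's actual algorithm: the function names are hypothetical, and averaging per-word confidences for `min_confidence` is an assumption.

```python
from typing import List, Optional


def classify_turn(
    confidence: float,
    silence_ms: int,
    eot_threshold: float = 0.7,
    eager_eot_threshold: Optional[float] = None,
    eot_timeout_ms: int = 5000,
) -> Optional[str]:
    """Toy model of the turn decision (illustrative, not Deepgram's logic)."""
    # A long enough silence ends the turn regardless of confidence.
    if silence_ms >= eot_timeout_ms:
        return "EndOfTurn"
    # Confidence at or above the main threshold confirms the end of turn.
    if confidence >= eot_threshold:
        return "EndOfTurn"
    # Eager prediction is only considered when its threshold is set.
    if eager_eot_threshold is not None and confidence >= eager_eot_threshold:
        return "EagerEndOfTurn"
    return None


def passes_min_confidence(
    word_confidences: List[float], min_confidence: Optional[float]
) -> bool:
    """Gate a transcript on its average confidence (assumed to be per word)."""
    if min_confidence is None or not word_confidences:
        return True
    return sum(word_confidences) / len(word_confidences) >= min_confidence
```

With `eot_threshold=0.8` and `eager_eot_threshold=0.5`, a confidence of 0.6 yields an eager event in this model, while the same confidence with `eager_eot_threshold=None` yields no event until the confidence rises or the timeout elapses.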

DeepgramSageMakerSTTService

Parameter	Type	Default	Description
`endpoint_name`	`str`	required	Name of the SageMaker endpoint with Deepgram model deployed.
`region`	`str`	required	AWS region where the SageMaker endpoint is deployed (e.g., `"us-east-2"`).
`sample_rate`	`int`	`None`	Audio sample rate in Hz. When `None`, uses the value from `live_options` or the pipeline’s configured sample rate.
`live_options`	`LiveOptions`	`None`	Deepgram LiveOptions for detailed configuration. When provided, these settings are merged with the defaults. See Deepgram LiveOptions for available options.
`ttfs_p99_latency`	`float`	`DEEPGRAM_SAGEMAKER_TTFS_P99`	P99 latency from speech end to final transcript in seconds. Override for your deployment.

The default LiveOptions for the SageMaker variant are:

Option	Default	Description
`encoding`	`"linear16"`	Audio encoding format.
`language`	`Language.EN`	Recognition language.
`model`	`"nova-3"`	Deepgram model to use.
`channels`	`1`	Number of audio channels.
`interim_results`	`True`	Stream partial recognition results.
`punctuate`	`True`	Add punctuation to transcripts.

Usage

Basic DeepgramSTTService

import os

from pipecat.services.deepgram import DeepgramSTTService

# Only the API key is required; all other settings use the defaults.
stt = DeepgramSTTService(
    api_key=os.getenv("DEEPGRAM_API_KEY"),
)

With Custom LiveOptions

import os

from deepgram import LiveOptions
from pipecat.services.deepgram import DeepgramSTTService

# Options given here are merged with the default LiveOptions.
stt = DeepgramSTTService(
    api_key=os.getenv("DEEPGRAM_API_KEY"),
    live_options=LiveOptions(
        model="nova-3-general",
        language="es",
        punctuate=True,
        smart_format=True,
    ),
)

DeepgramFluxSTTService

import os

from pipecat.services.deepgram.flux import DeepgramFluxSTTService

# Uses the default Flux model and leaves EagerEndOfTurn disabled.
stt = DeepgramFluxSTTService(
    api_key=os.getenv("DEEPGRAM_API_KEY"),
)

Flux with EagerEndOfTurn

import os

from pipecat.services.deepgram.flux import DeepgramFluxSTTService

stt = DeepgramFluxSTTService(
    api_key=os.getenv("DEEPGRAM_API_KEY"),
    params=DeepgramFluxSTTService.InputParams(
        eager_eot_threshold=0.5,  # setting a value enables EagerEndOfTurn
        eot_threshold=0.8,  # stricter than the 0.7 default
        keyterm=["Pipecat", "Deepgram"],  # boost recognition of these terms
    ),
)

SageMaker Service

import os

from deepgram import LiveOptions
from pipecat.services.deepgram.stt_sagemaker import DeepgramSageMakerSTTService

# AWS credentials are resolved from the environment, AWS CLI, or instance metadata.
stt = DeepgramSageMakerSTTService(
    endpoint_name=os.getenv("SAGEMAKER_STT_ENDPOINT_NAME"),
    region=os.getenv("AWS_REGION"),
    live_options=LiveOptions(
        model="nova-3",
        language="en",
        interim_results=True,
        punctuate=True,
    ),
)

Notes

Finalize on VAD stop: When the pipeline’s VAD detects the user has stopped speaking, DeepgramSTTService and DeepgramSageMakerSTTService send a finalize request to Deepgram for faster final transcript delivery.
Flux turn management: DeepgramFluxSTTService provides its own turn detection via StartOfTurn/EndOfTurn events and broadcasts UserStartedSpeakingFrame/UserStoppedSpeakingFrame directly. Use ExternalUserTurnStrategies to avoid conflicting VAD-based turn management.
EagerEndOfTurn: In Flux, enabling eager_eot_threshold provides faster response times by predicting end-of-turn before it is confirmed. EagerEndOfTurn transcripts are pushed as InterimTranscriptionFrames. If the user resumes speaking, a TurnResumed event is fired.
Deprecated vad_events: The vad_events option in standard DeepgramSTTService is deprecated. Use Silero VAD instead.
SageMaker deployment: The SageMaker service requires a Deepgram model deployed to an AWS SageMaker endpoint. See the Deepgram SageMaker deployment guide for setup instructions.
SageMaker keepalive: The SageMaker service automatically sends KeepAlive messages every 5 seconds to maintain the connection during periods of silence.
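The keepalive behavior in the last note can be pictured as a background task that sends a message at a fixed interval. The sketch below is not Pipecat's implementation: `keepalive_loop`, `demo`, and the `send` callable are hypothetical, and the JSON shape follows Deepgram's streaming `KeepAlive` control message. How the SageMaker transport frames that message is not shown here.

```python
import asyncio
import json
from typing import Awaitable, Callable, List


async def keepalive_loop(
    send: Callable[[str], Awaitable[None]],
    interval_s: float = 5.0,
) -> None:
    """Send a KeepAlive message every `interval_s` seconds until cancelled."""
    while True:
        await asyncio.sleep(interval_s)
        await send(json.dumps({"type": "KeepAlive"}))


async def demo() -> List[str]:
    """Run the loop briefly against a fake sender and return what was sent."""
    sent: List[str] = []

    async def fake_send(message: str) -> None:
        sent.append(message)

    # A short interval keeps the demo fast; the service uses 5 seconds.
    task = asyncio.create_task(keepalive_loop(fake_send, interval_s=0.01))
    await asyncio.sleep(0.05)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return sent
```

Running `asyncio.run(demo())` returns the list of messages the fake sender received. Cancelling the task on disconnect, as the demo does, keeps the loop from outliving the connection.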

Event Handlers

All three services support the standard service connection events (on_connected, on_disconnected, on_connection_error). Additionally, DeepgramSTTService and DeepgramFluxSTTService provide service-specific events:

DeepgramSTTService

Event	Description
`on_speech_started`	Speech detected in the audio stream
`on_utterance_end`	End of utterance detected by Deepgram

@stt.event_handler("on_speech_started")
async def on_speech_started(service):
    print("User started speaking")

@stt.event_handler("on_utterance_end")
async def on_utterance_end(service):
    print("Utterance ended")

DeepgramFluxSTTService

Deepgram Flux provides turn-level events for more granular conversation tracking:

Event	Description
`on_start_of_turn`	Start of a new turn detected
`on_turn_resumed`	A previously paused turn has resumed
`on_end_of_turn`	End of turn detected
`on_eager_end_of_turn`	Early end-of-turn prediction
`on_update`	Transcript updated

@stt.event_handler("on_start_of_turn")
async def on_start_of_turn(service, transcript):
    print(f"Turn started: {transcript}")

@stt.event_handler("on_end_of_turn")
async def on_end_of_turn(service, transcript):
    print(f"Turn ended: {transcript}")

@stt.event_handler("on_eager_end_of_turn")
async def on_eager_end_of_turn(service, transcript):
    print(f"Early end-of-turn prediction: {transcript}")

Turn events receive (service, transcript) where transcript is the current transcript text. The on_turn_resumed event receives only (service).
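One way to use these events together is to treat an eager end-of-turn as speculative and discard it if the turn resumes. The following sketch uses a hypothetical `TurnTracker` class with handler signatures matching those described above; a plain object stands in for the service so the example runs without Pipecat installed.

```python
import asyncio
from typing import List, Optional


class TurnTracker:
    """Tracks speculative and committed transcripts (hypothetical helper)."""

    def __init__(self) -> None:
        self.speculative: Optional[str] = None
        self.committed: List[str] = []

    async def on_eager_end_of_turn(self, service, transcript: str) -> None:
        # Early prediction: remember the transcript so work can start early.
        self.speculative = transcript

    async def on_turn_resumed(self, service) -> None:
        # The user kept talking: discard anything started speculatively.
        self.speculative = None

    async def on_end_of_turn(self, service, transcript: str) -> None:
        # Confirmed end of turn: commit the final transcript.
        self.committed.append(transcript)
        self.speculative = None


async def simulate() -> TurnTracker:
    """Replay an eager prediction, a resumed turn, and a confirmed end."""
    tracker = TurnTracker()
    service = object()  # stand-in for the STT service instance
    await tracker.on_eager_end_of_turn(service, "book a table")
    await tracker.on_turn_resumed(service)
    await tracker.on_end_of_turn(service, "book a table for two")
    return tracker
```

With a real service, each method would be registered under its event name in the same way as the handlers shown earlier. Whether bound methods can be passed directly to `stt.event_handler(...)` is an assumption to verify against the Pipecat API reference.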
