How to Set Up Speech-to-Text Input with Text-Only Output in LiveKit Agents

Last updated: August 18, 2025

You can configure a LiveKit Agent to accept audio input (speech-to-text) while responding only with text. Here's how to set it up and receive the responses:

Configuration

1. Disable audio output by setting audio_enabled=False in the RoomOutputOptions that you pass when starting the agent session.

2. With audio output disabled, the agent publishes its text responses to the lk.transcription text stream topic. These streams do not carry a lk.transcribed_track_id attribute.
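
The configuration step above can be sketched as follows. This is a minimal sketch rather than a complete agent: it assumes the Python livekit-agents 1.x package, the entrypoint name and the instructions string are placeholders, and the STT and LLM plugins are left out because they depend on your providers.

```python
from livekit.agents import (
    Agent,
    AgentSession,
    JobContext,
    RoomOutputOptions,
    WorkerOptions,
    cli,
)


async def entrypoint(ctx: JobContext) -> None:
    await ctx.connect()

    # Assumption: add your own stt=..., llm=... plugins here, as in your
    # existing agent. They are omitted to keep the sketch provider-neutral.
    session = AgentSession()

    await session.start(
        agent=Agent(instructions="You are a helpful assistant."),  # placeholder
        room=ctx.room,
        # Disable audio output so that responses are published as text only.
        room_output_options=RoomOutputOptions(audio_enabled=False),
    )


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```

Audio input is not affected by this option, so the agent still transcribes the user's speech.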

Receiving Agent Responses

To receive the agent's text responses, your client must register a handler for the lk.transcription text stream topic. The built-in playground UI relies on legacy transcription events, so it does not display responses when audio track publishing is disabled.
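
A receiver along these lines might look like the sketch below. It assumes the Python rtc SDK, where a connected Room exposes register_text_stream_handler(topic, handler) and the handler receives a stream reader plus the sender's identity. The helper names (is_text_only_response, read_agent_response, register_receiver) are hypothetical, and the room argument is duck-typed so that the logic can be tried without a live connection.

```python
import asyncio
from typing import Callable, Mapping, Optional

TRANSCRIPTION_TOPIC = "lk.transcription"
TRACK_ID_ATTR = "lk.transcribed_track_id"

# Keep references to running tasks so they are not garbage collected early.
_tasks: set = set()


def is_text_only_response(attributes: Optional[Mapping[str, str]]) -> bool:
    """True when the stream has no lk.transcribed_track_id attribute."""
    return TRACK_ID_ATTR not in (attributes or {})


async def read_agent_response(reader, participant_identity: str) -> Optional[str]:
    """Return the full text of a text-only response, or None for other streams."""
    if not is_text_only_response(reader.info.attributes):
        return None  # transcription tied to an audio track; skipped here
    return await reader.read_all()


def register_receiver(room, on_text: Callable[[str, str], None]) -> None:
    """Register a handler on the lk.transcription topic (hypothetical helper)."""

    async def _consume(reader, identity: str) -> None:
        text = await read_agent_response(reader, identity)
        if text is not None:
            on_text(identity, text)

    def _handler(reader, identity: str) -> None:
        # The handler is called synchronously, so reading happens in a task.
        task = asyncio.create_task(_consume(reader, identity))
        _tasks.add(task)
        task.add_done_callback(_tasks.discard)

    room.register_text_stream_handler(TRANSCRIPTION_TOPIC, _handler)
```

After connecting, a client would call something like register_receiver(room, lambda who, text: print(f"{who}: {text}")). Filtering on the missing lk.transcribed_track_id attribute is one way to separate text-only agent responses from transcriptions that belong to an audio track; filtering on the sender's identity is another option.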

Example Implementation

You can find example implementations here:

Speech-to-text agent setup: Transcriber Example
Text stream receiver: Chat Stream Receiver Example

Note: If you're using the console playground and don't see agent responses to audio input, this is expected behavior. You must implement a custom receiver to listen to the lk.transcription text stream topic.