How to capture real-time agent transcriptions during conversations
Last updated: August 18, 2025
When building conversation history features, there are several ways to capture agent transcriptions in real-time, even when users disconnect mid-conversation.
Using conversation_item_added Events
The simplest approach is to use the conversation_item_added event. In the latest agent version, with RoomInputOptions.close_on_disconnect enabled (enabled by default), you'll receive the incomplete agent turn immediately when a user leaves the room.
Manual Session Closure
Alternatively, you can manually trigger the capture of an incomplete agent turn by calling:
await session.aclose()This will stop the agent and flush any incomplete agent turn.
Real-time Transcription Options
For more granular real-time transcription:
Use the
transcription_nodeto collect transcripts as they are generatedFor TTS-aligned transcriptions, you can implement solutions using Cartesia or ElevenLabs (note: this feature is not available with NovaSonic)
For detailed implementation of TTS-aligned transcriptions, refer to our documentation on TTS-aligned transcriptions.