Voice AI Agent Architecture: Streaming Deepgram → OpenAI → ElevenLabs in Production

A live technical walkthrough of building a production voice AI agent, detailing orchestration of Deepgram, OpenAI, and ElevenLabs with n8n and Supabase.

Overview

Live technical walkthrough of building RENATA, a production voice AI agent handling real property management queries.
I’ll show the actual implementation: how we orchestrate Retell AI webhooks with n8n workflows to process incoming calls, stream audio through Deepgram for real-time transcription, query Supabase vector stores for property-specific context, inject that context into OpenAI function calls, and stream responses back through ElevenLabs for natural voice output.
Code walkthrough includes:

n8n JavaScript nodes for parsing conversation state and building dynamic prompts
Supabase queries that retrieve property data based on conversation context
OpenAI function calling schema for actions (book reservation, escalate to human, check availability)
Retell AI custom LLM integration for handling interruptions and natural conversation flow
WhatsApp Business API webhook handling for message routing
Error handling and fallback logic when systems don’t respond

I’ll show the messy parts: how we handle race conditions in real-time voice, the hacky way we maintain conversation state across n8n executions, and the specific prompt engineering tricks that make the agent actually useful instead of generically polite.
Demo includes live code, actual n8n workflows, database schemas, and API calls.

Tech stack