A working AI dashboard is mostly not the model. It is the boring plumbing around the model — streaming, auth, webhooks, retries, observability — that decides whether your product feels alive or feels broken. Here is how I have been building these dashboards at Technosmart and on freelance work.
What the dashboard actually does
The version I ship has three jobs, and an architecture should be evaluated on how cleanly it handles all three:
- Configure AI workers / agents (prompts, voices, tools, escalation rules).
- Monitor live calls and conversations with token-level streaming so the operator sees what the AI is saying as it says it.
- Review historical interactions, with search, filters, and quick-replay.
Most online "AI dashboard" demos only do job 1. Jobs 2 and 3 are where production lives.
The stack, and why
- Next.js 15 (App Router) on the frontend. Server actions for mutations, route handlers for streaming, RSC for the heavy data fetches.
- Node.js + Fastify backend for everything that has to live outside the Next.js process — webhooks, long-running jobs, queued tasks.
- PostgreSQL for state. pgvector for transcript embedding search.
- Server-Sent Events for live streaming. Not WebSockets. We will come back to that choice.
- Retell AI for the actual voice runtime; our dashboard wraps it.
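To make the pgvector line concrete, here is a minimal sketch of a transcript search query builder. The table and column names (transcript_chunks, call_id, workspace_id, content, embedding) are hypothetical, not my real schema; the only pgvector-specific parts are the `<=>` cosine-distance operator and the `'[...]'` vector literal.

```typescript
// Hypothetical schema (an assumption for illustration):
//   transcript_chunks(id, call_id, workspace_id, content, embedding vector)
interface SqlQuery {
  text: string;
  values: unknown[];
}

// pgvector accepts a vector written as the text literal '[0.1,0.2,...]'.
function toVectorLiteral(embedding: number[]): string {
  if (embedding.length === 0 || embedding.some((x) => !Number.isFinite(x))) {
    throw new Error('embedding must be a non-empty array of finite numbers');
  }
  return `[${embedding.join(',')}]`;
}

// Builds a parameterised nearest-neighbour query scoped to one workspace.
// `<=>` is pgvector's cosine distance: smaller means more similar.
function buildTranscriptSearch(
  workspaceId: string,
  embedding: number[],
  limit = 10
): SqlQuery {
  return {
    text: [
      'SELECT id, call_id, content, embedding <=> $1::vector AS distance',
      'FROM transcript_chunks',
      'WHERE workspace_id = $2',
      'ORDER BY embedding <=> $1::vector',
      'LIMIT $3',
    ].join('\n'),
    values: [toVectorLiteral(embedding), workspaceId, limit],
  };
}
```

The returned object has the `{ text, values }` shape that node-postgres accepts in `client.query(...)`, but any driver with positional parameters would do.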
The streaming question: SSE or WebSockets?
The first big choice. Both work. I default to SSE for AI dashboards because:
- It is one-directional (server → browser), which is exactly the model's behaviour.
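- HTTP/2 multiplexing makes connection cost trivial: many streams share one connection, so the roughly six-connections-per-origin cap that browsers apply to HTTP/1.1 stops being a problem.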
- It survives load balancers and CDNs that hate WebSockets.
- The browser auto-reconnects. WebSockets need you to write that yourself.
WebSockets are right when the user is sending lots of small messages back. For a dashboard where the user mostly watches and the AI mostly talks, SSE is calmer.
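The auto-reconnect point is worth a sketch, because it only pays off if the server cooperates. When an EventSource reconnects, the browser sends the last `id:` it saw in a `Last-Event-ID` request header. The helper names below (formatSseEvent, resumeIndex) and the idea of using a token index as the event id are my assumptions for illustration, not part of any library.

```typescript
interface SseEvent {
  data: unknown;      // serialised as JSON on a single data: line
  event?: string;     // custom event name; omitted means the default "message"
  id?: string;        // echoed back by the browser as Last-Event-ID on reconnect
  retryMs?: number;   // how long the browser waits before reconnecting
}

// Formats one event in the text/event-stream wire format.
function formatSseEvent({ data, event, id, retryMs }: SseEvent): string {
  const lines: string[] = [];
  if (retryMs !== undefined) lines.push(`retry: ${retryMs}`);
  if (id !== undefined) lines.push(`id: ${id}`);
  if (event !== undefined) lines.push(`event: ${event}`);
  // JSON.stringify (without an indent argument) never emits raw newlines,
  // so a single data: line is always enough.
  lines.push(`data: ${JSON.stringify(data)}`);
  return lines.join('\n') + '\n\n'; // a blank line terminates the event
}

// Assumption: event ids are zero-based token indexes.
// Returns the index of the first token the client has not seen yet.
function resumeIndex(lastEventId: string | null): number {
  const n = Number.parseInt(lastEventId ?? '', 10);
  return Number.isInteger(n) && n >= 0 ? n + 1 : 0;
}
```

In a route handler the header would be read with `req.headers.get('last-event-id')`; replaying from that index assumes the tokens are stored somewhere the handler can read them back.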
The streaming endpoint, simplified
A Next.js route handler that proxies model tokens to the browser:
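// app/api/agent/stream/route.ts
import OpenAI from 'openai';
// requireSession, pickModel and buildMessages are app-specific helpers defined elsewhere.

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// GET, not POST: the browser's EventSource can only issue GET requests,
// so the parameters travel in the query string.
export async function GET(req: Request) {
  const auth = await requireSession(); // authenticate before doing any work
  const params = new URL(req.url).searchParams;
  const workerId = params.get('id');
  if (!workerId) return new Response('Missing id', { status: 400 });
  const message = params.get('message') ?? '';
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      const upstream = await openai.chat.completions.create({
        model: pickModel(workerId),
        stream: true,
        messages: await buildMessages(workerId, message, auth.workspaceId),
      });
      for await (const chunk of upstream) {
        const token = chunk.choices[0]?.delta?.content ?? '';
        if (token) {
          // One SSE event per token; the blank line ends the event.
          controller.enqueue(
            encoder.encode(`data: ${JSON.stringify({ t: token })}\n\n`)
          );
        }
      }
      // Tell the client to close; otherwise EventSource would reconnect.
      controller.enqueue(encoder.encode('event: done\ndata: {}\n\n'));
      controller.close();
    },
  });
  return new Response(stream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache, no-transform',
      Connection: 'keep-alive',
    },
  });
}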
Receiving on the client
The browser side is short. The trick is to write tokens to a local buffer, not to React state, and flush via requestAnimationFrame — otherwise React re-renders every single token and your UI starts to wobble.
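// Inside a client component. Needs: import { useEffect, useRef, useState } from 'react';
const [text, setText] = useState('');
const bufferRef = useRef('');

// Flush buffered tokens into state at most once per animation frame.
useEffect(() => {
  let raf: number;
  const flush = () => {
    if (bufferRef.current) {
      // Snapshot and clear before calling setText. React may run the updater
      // later (and twice in Strict Mode), so reading bufferRef.current inside
      // it could see an already-cleared buffer and drop tokens.
      const pending = bufferRef.current;
      bufferRef.current = '';
      setText((t) => t + pending);
    }
    raf = requestAnimationFrame(flush);
  };
  raf = requestAnimationFrame(flush);
  return () => cancelAnimationFrame(raf);
}, []);

// Append incoming tokens to the buffer, never directly to state.
useEffect(() => {
  const es = new EventSource('/api/agent/stream?id=' + encodeURIComponent(workerId));
  es.onmessage = (e) => {
    const { t } = JSON.parse(e.data);
    bufferRef.current += t;
  };
  es.addEventListener('done', () => es.close());
  return () => es.close();
}, [workerId]);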
Webhooks: the thing that always breaks
Retell and most voice platforms call your webhook with call events: call.started, call.ended, transcript.partial, etc. Three rules I learned the painful way, by ignoring them:
- Verify the signature. Every provider includes one. Reject the request if it does not match — bots will absolutely send fake events at your webhook URL once it is public.
- Respond in under 3 seconds. Most providers retry on a timeout. If your handler does heavy work, push it to a queue and return 200 immediately.
- Idempotency. Events will arrive twice. Store the event ID, ignore duplicates.
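The three rules fit in one small handler. This is a sketch under loud assumptions: I am assuming an HMAC-SHA256 hex signature over the raw body, which is a common scheme but not necessarily what Retell or your provider uses, so check their docs or SDK. The Set and the array stand in for a unique index in PostgreSQL and a real job queue.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Assumption: the provider signs the raw request body with HMAC-SHA256
// and sends the digest as hex. Real providers differ; follow their docs.
function verifySignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac('sha256', secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

interface WebhookEvent {
  id: string;   // hypothetical field name for the provider's event ID
  type: string; // e.g. call.started, call.ended, transcript.partial
}

// Stand-ins for illustration only.
const seenEventIds = new Set<string>(); // production: unique index in PostgreSQL
const queue: WebhookEvent[] = [];       // production: a real job queue

// Returns the HTTP status to send. No heavy work happens here,
// so the response goes out well inside the provider's timeout.
function handleWebhook(rawBody: string, signatureHex: string, secret: string): number {
  if (!verifySignature(rawBody, signatureHex, secret)) return 401; // rule 1
  const event = JSON.parse(rawBody) as WebhookEvent;
  if (seenEventIds.has(event.id)) return 200; // rule 3: duplicate, acknowledge and skip
  seenEventIds.add(event.id);
  queue.push(event); // rule 2: defer the slow work
  return 200;
}
```

Note that the signature is checked against the raw body string, before JSON parsing. Verifying a re-serialised object is a classic way to get mismatches.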
Auth: do not skip this even for an internal tool
Use Auth.js v5 (formerly NextAuth). Argon2id for password hashing. Session cookies, not JWT in localStorage. Yes, even for an internal dashboard — laptops get lost, and a sloppy auth model becomes the breach story you tell at the next job.
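For reference, this is roughly what "session cookies, not JWT in localStorage" means on the wire. Auth.js sets its own cookies, so you would not normally write this by hand; the helper name, cookie name, and eight-hour lifetime below are assumptions for illustration. The point is the attributes.

```typescript
interface SessionCookieOptions {
  name?: string;          // hypothetical default: "session"
  maxAgeSeconds?: number; // hypothetical default: 8 hours
}

// Builds a Set-Cookie header value for an opaque, server-side session ID.
function buildSessionCookie(sessionId: string, opts: SessionCookieOptions = {}): string {
  const { name = 'session', maxAgeSeconds = 60 * 60 * 8 } = opts;
  return [
    `${name}=${encodeURIComponent(sessionId)}`,
    'Path=/',
    `Max-Age=${maxAgeSeconds}`,
    'HttpOnly',     // not readable from JavaScript, unlike localStorage
    'Secure',       // only sent over HTTPS
    'SameSite=Lax', // not sent on cross-site subrequests or cross-site POSTs
  ].join('; ');
}
```

HttpOnly is the attribute doing most of the work here: a script injected into the page cannot read the session, which is exactly what a token in localStorage cannot promise.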
Observability
For a dashboard, I instrument three things from day one:
- Every model call (model, tokens, latency, feature, cache hit).
- Every webhook (provider, event type, processing time, retry count).
- Every UI error (the streaming bar froze, the SSE reconnected, etc.) via the browser's reportError hook.
None of this needs Datadog. PostgreSQL + a tiny Grafana panel is more than enough until you outgrow it.
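A minimal sketch of the first item, instrumenting model calls. The record shape mirrors the fields listed above; the names (ModelCallRecord, instrumentModelCall, the sink callback) are hypothetical, and in my setup the sink would be an INSERT into a PostgreSQL table.

```typescript
interface ModelCallRecord {
  model: string;
  feature: string;
  latencyMs: number;
  promptTokens: number;
  completionTokens: number;
  cacheHit: boolean;
  ok: boolean;
}

interface ModelCallResult<T> {
  value: T;
  promptTokens: number;
  completionTokens: number;
  cacheHit: boolean;
}

// Where records go; in production this could insert a row into PostgreSQL.
type Sink = (record: ModelCallRecord) => void | Promise<void>;

// Runs `call`, measures wall-clock latency, and always emits exactly one
// record, whether the call succeeds or throws.
async function instrumentModelCall<T>(
  meta: { model: string; feature: string },
  call: () => Promise<ModelCallResult<T>>,
  sink: Sink
): Promise<T> {
  const startedAt = Date.now();
  let record: ModelCallRecord = {
    ...meta,
    latencyMs: 0,
    promptTokens: 0,
    completionTokens: 0,
    cacheHit: false,
    ok: false,
  };
  try {
    const { value, ...usage } = await call();
    record = { ...record, ...usage, ok: true };
    return value;
  } finally {
    record.latencyMs = Date.now() - startedAt;
    await sink(record);
  }
}
```

One caveat with this shape: if the sink itself throws, the error replaces the model result. For a dashboard I would rather swallow and log sink failures than fail the user's request over a metrics row.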
What I would not do again
- Treat the model as the product. The dashboard is the product. The model is one library it uses.
- Render every streamed token as React state. Use a buffer + RAF.
- Trust the provider's webhook ordering. Re-sort on event timestamp before applying.
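The last point is small enough to show. This sketch assumes each event carries a numeric timestamp field (the field name is my assumption; providers differ) and re-sorts a batch before applying it.

```typescript
interface CallEvent {
  id: string;
  type: string;      // e.g. call.started, transcript.partial, call.ended
  timestamp: number; // assumption: provider-supplied epoch milliseconds
}

// Returns a new array ordered by event timestamp, leaving the input untouched.
// Array.prototype.sort is stable, so events with equal timestamps keep
// their arrival order.
function orderEvents<T extends CallEvent>(events: T[]): T[] {
  return [...events].sort((a, b) => a.timestamp - b.timestamp);
}

// Events as they might arrive: call.ended delivered before call.started.
const arrived: CallEvent[] = [
  { id: 'e3', type: 'call.ended', timestamp: 3000 },
  { id: 'e1', type: 'call.started', timestamp: 1000 },
  { id: 'e2', type: 'transcript.partial', timestamp: 2000 },
];

const ordered = orderEvents(arrived).map((e) => e.type);
// ordered is ['call.started', 'transcript.partial', 'call.ended']
```

Sorting a batch only helps for events you already hold. A late event that shows up after you applied its successors still needs handling, for example by rebuilding call state from the stored event log.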
Final architecture diagram
[Operator browser]
↕ SSE for streaming, HTTPS for actions
[Next.js App Router] ── Server Actions for mutations
│
├─→ [Node + Fastify worker pool]
│ ↑
│ └── Queue for slow work
│
├─→ [Retell AI runtime] ── webhook → /api/webhook/retell
│
└─→ [PostgreSQL + pgvector]
This is genuinely the stack I ship in production. If you want one built for your business — voice, chat, internal ops dashboard — start the conversation in the contact section on the homepage.