Insights
Voice AI · Compliance

How to ship voice AI inside a regulated perimeter (without lying to your compliance officer).

The reference architecture, the consent patterns, and the audit logging shape we use for regulated voice deployments — distilled from the Hayah engagement.

Moussa Sangare · 12 min read · Mar 15, 2026

The problem nobody puts on the demo

Every voice AI vendor demo opens with a frictionless customer call: clean audio, a happy customer, a tidy transcript. Then you sign the contract and try to put that same agent in front of a regulated insurance flow, and the wheels come off. PII spills into transcripts. Consent isn't captured. Audit logs are JSON blobs nobody can query. The compliance officer asks one question — "prove to me this call was lawful" — and the whole stack falls silent.

This is the part of voice AI that nobody publishes a tutorial about. So here's ours.

The reference architecture we ship

We build every regulated voice deployment on three layers, in this order:

  1. Telephony + media: usually Twilio or a regional SIP trunk. The job here is just to land the audio safely inside our perimeter and tag every call with a call_id that lives forever.
  2. Inference + orchestration: Ultravox for speech-to-speech, with a thin orchestrator (a TypeScript service, no framework) that holds the state machine. Why our own state machine and not LangGraph? Because regulated flows have hard rules (you cannot quote a premium before consent is captured), and we want those rules in plain code that an auditor can read.
  3. Storage + audit: Postgres for the operational data, S3 (with Object Lock + KMS) for the raw audio, and a separate append-only audit_events table on which no service can run UPDATE or DELETE.
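To make the "hard rules in plain code" point concrete, here is a minimal sketch of a consent-gated state machine. The state and event names are hypothetical, chosen for illustration, and are not taken from the Hayah codebase. The only rule it encodes from the text is that a premium cannot be quoted before consent is captured.

```typescript
// Hypothetical states and events; a real flow would have more of both.
type CallState = "greeting" | "consent_pending" | "consented" | "quoting" | "ended";
type CallEvent =
  | "ask_consent"
  | "consent_granted"
  | "consent_refused"
  | "quote_premium"
  | "hang_up";

// Every legal transition is listed here. Anything absent is rejected,
// so "quote_premium" is simply unreachable until consent has been granted.
const transitions: Record<CallState, Partial<Record<CallEvent, CallState>>> = {
  greeting: { ask_consent: "consent_pending", hang_up: "ended" },
  consent_pending: {
    consent_granted: "consented",
    consent_refused: "ended",
    hang_up: "ended",
  },
  consented: { quote_premium: "quoting", hang_up: "ended" },
  quoting: { quote_premium: "quoting", hang_up: "ended" },
  ended: {},
};

// Returns the next state, or throws if the event is not allowed in this state.
function nextState(state: CallState, event: CallEvent): CallState {
  const target = transitions[state][event];
  if (target === undefined) {
    throw new Error(`event "${event}" is not allowed in state "${state}"`);
  }
  return target;
}
```

Because a transition missing from the table throws, a call such as `nextState("greeting", "quote_premium")` fails immediately instead of quoting ahead of consent, and the whole rule set fits on one screen for an auditor to read.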

The thing that makes this work isn't any one piece. It's that consent, retention, and audit are first-class concerns from the schema up — not a logging middleware bolted on at the end.

Consent, the way the regulator actually wants it

Consent in voice AI has three moments, and you must capture all three:

Each of these lands in audit_events as a row carrying event_type, consent_version, prompt_version, agent_version, and transcript_offset_ms. When the regulator asks, you can reconstruct exactly what was said, when, by whom, and what the customer agreed to.
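As a rough illustration of that row shape, the sketch below models a consent audit event and an in-memory, append-only stand-in for the table. The field names follow the text, with call_id coming from the telephony layer. The `createAuditLog` helper, the string types, and the event_type values shown in the usage note are assumptions made for the example, not the production schema.

```typescript
// Field names follow the article; the types are assumptions.
interface ConsentAuditEvent {
  readonly call_id: string;
  readonly event_type: string; // real values depend on your consent taxonomy
  readonly consent_version: string;
  readonly prompt_version: string;
  readonly agent_version: string;
  readonly transcript_offset_ms: number;
}

// Hypothetical in-memory stand-in for the append-only table:
// rows can be appended and read, never changed or removed.
function createAuditLog() {
  const rows: ConsentAuditEvent[] = [];
  return {
    append(event: ConsentAuditEvent): void {
      // Rejects negative offsets and NaN.
      if (!(event.transcript_offset_ms >= 0)) {
        throw new RangeError("transcript_offset_ms must be a non-negative number");
      }
      // Store a frozen copy so later mutation by the caller cannot alter the record.
      rows.push(Object.freeze({ ...event }));
    },
    // Rows for one call, ordered by their position in the transcript.
    forCall(callId: string): ConsentAuditEvent[] {
      return rows
        .filter((row) => row.call_id === callId)
        .sort((a, b) => a.transcript_offset_ms - b.transcript_offset_ms);
    },
  };
}
```

A call such as `log.append({ call_id: "call-1", event_type: "example_event", consent_version: "v3", prompt_version: "p7", agent_version: "a2", transcript_offset_ms: 1500 })` records one row, and `log.forCall("call-1")` returns that call's consent history in transcript order.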

Audit logging that survives a real audit

The trap with audit logging is writing too much and querying too little. We log five things, religiously:

That's it. No request/response dumps. No "info" logs. Audit is for proving compliance, not for debugging. Debugging logs live somewhere else, with 30-day retention.

What we'd do differently next time

After Hayah we changed two things in the template. First, we now version the consent script itself as a content-collection entry, so the script and the prompt that reads it are tied together at deploy time. Second, we built a replay tool that takes a call_id and walks the auditor through the call as a timeline — transcripts, consent events, tool calls, all in one view. The auditor stopped asking for screenshots. That alone was worth a week of engineering.
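The core of a replay view like that is a merge of several record streams onto one clock. The sketch below is one way it could look, not the actual tool: the record shapes, `buildTimeline`, and `renderEntry` are all hypothetical, and the only thing taken from the text is that transcripts, consent events, and tool calls for a single call_id end up in one ordered view.

```typescript
// Hypothetical record shapes, each carrying an offset from the start of the call.
type TimelineEntry =
  | { kind: "transcript"; call_id: string; offset_ms: number; speaker: "agent" | "customer"; text: string }
  | { kind: "consent"; call_id: string; offset_ms: number; event_type: string; consent_version: string }
  | { kind: "tool_call"; call_id: string; offset_ms: number; tool: string };

// Collects the entries for one call from any number of sources and
// orders them by offset, so the auditor reads the call top to bottom.
function buildTimeline(callId: string, ...sources: TimelineEntry[][]): TimelineEntry[] {
  const merged: TimelineEntry[] = [];
  for (const source of sources) {
    for (const entry of source) {
      if (entry.call_id === callId) merged.push(entry);
    }
  }
  return merged.sort((a, b) => a.offset_ms - b.offset_ms);
}

// Renders one entry as a single human-readable line.
function renderEntry(entry: TimelineEntry): string {
  const seconds = (entry.offset_ms / 1000).toFixed(1);
  switch (entry.kind) {
    case "transcript":
      return `[${seconds}s] ${entry.speaker}: ${entry.text}`;
    case "consent":
      return `[${seconds}s] consent ${entry.event_type} (script ${entry.consent_version})`;
    case "tool_call":
      return `[${seconds}s] tool ${entry.tool}`;
  }
}
```

Calling `buildTimeline(callId, transcripts, consentEvents, toolCalls).map(renderEntry)` yields one line per event in call order, which is the kind of output an auditor can read without asking for screenshots.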

Takeaway

Voice AI in a regulated perimeter isn't a model problem. It's a data discipline problem. Get consent versioning, audit immutability, and replay tooling right on day one, and the rest of the build is just product. Get them wrong and you'll be rebuilding from the schema up six months in.
