
Pete Hanlon, CTO of Moneypenny. Moneypenny handles outsourced phone calls, live chat and digital communications for thousands of companies globally.
Every company deploying an AI voice agent has made the same bet, whether they realize it or not. They’ve let the model design their customer conversation.
The goals are clear: collect the caller’s name, understand their enquiry and route them to the right team. The agent achieves all three. Metrics look healthy. But ask anyone on the team to explain how the conversation actually unfolds (why the AI asks what it asks, in what order, with what tone and what happens when the caller says something unexpected) and nobody can tell you. Because nobody decided. The model did.
The Shortcut Most Companies Take
This isn’t a bug; it’s a shortcut. It means giving a large language model (LLM) a set of goals and letting it figure out how to get there. That is fast to build, easy to demo and avoids the genuinely hard work of designing the conversation. The LLM decides the order of questions, the phrasing and the recovery when something goes wrong. It improvises your customer experience in real time.
The result is an organizational gap that almost nobody talks about. Product teams define what the agent should achieve and engineering teams build the systems, but the actual design of the conversation (the structure, tone, sequencing and recovery logic that sits between the goal and the language model) belongs to neither. The model has filled the vacuum by default.
Every other customer-facing discipline has dedicated ownership. Brand owns visual identity, product owns the interface and marketing owns messaging. But the conversation, the thing the customer experiences most directly, has been left to improvisation.
Why Locking It Down Doesn’t Work Either
The obvious response is to lock it all down: define every question, branch and response. But that creates the opposite problem. You end up with a rigid script that feels like a classic interactive voice response (IVR) menu with better grammar: “Press 1 for sales, press 2 for support…”
The model adds no real value because you’ve constrained it to the point where it can’t do what it’s actually good at, which is making conversations feel natural, handling the unexpected and responding like a human being rather than a flowchart.
The Conversation Control Layer
The real challenge is finding the sweet spot between these two extremes. Too much flexibility and you can’t guarantee the agent has done its job. Too little and you’ve built expensive automation that callers hate. What’s needed is what I’d call the conversation control layer.
The companies getting this right are the ones that have actually built one, understanding which parts of the conversation need to be controlled and which can be left to the model. Working with customer conversations over many years has given me a clear view of where structure matters and where flexibility creates value. That experience has informed how we should think about finding the balance between control and natural conversation.
What Should Be Controlled (And What Shouldn’t)
Here’s how I believe you should think about that boundary. The things that need to be deterministic are the things that need to be auditable. Did the agent collect the information it was supposed to collect? Did it follow the correct routing logic? Did it stay within the guardrails? These are not questions you want answered by probability. They need to be governed by code that’s testable, predictable and auditable in real time.
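To make that boundary concrete, here is a minimal sketch of what the deterministic side might look like in Python. It is an illustration under stated assumptions, not a description of any production system: the field names, routing table and class name are all hypothetical. The point is only that "did we collect it?" and "where does it go?" are answered by plain code and leave an audit trail.

```python
from dataclasses import dataclass, field

# Hypothetical required fields and routing table, invented for illustration.
REQUIRED_FIELDS = ("caller_name", "enquiry_type")
ROUTES = {"sales": "sales_team", "support": "support_team"}
FALLBACK_ROUTE = "reception"


@dataclass
class ConversationState:
    """Deterministic record of what the agent has collected so far."""
    collected: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def record(self, name: str, value: str) -> None:
        """Store a value the model extracted and log that it happened."""
        if name not in REQUIRED_FIELDS:
            raise ValueError(f"unknown field: {name}")
        self.collected[name] = value
        self.audit_log.append(("collected", name))

    def missing_fields(self) -> list:
        """Return required fields not yet collected, in their defined order."""
        return [f for f in REQUIRED_FIELDS if f not in self.collected]

    def route(self) -> str:
        """Pick a destination in code; refuse until every field is collected."""
        missing = self.missing_fields()
        if missing:
            raise RuntimeError(f"cannot route, missing: {missing}")
        team = ROUTES.get(self.collected["enquiry_type"], FALLBACK_ROUTE)
        self.audit_log.append(("routed", team))
        return team


# Example: the model extracts values; the code decides what happens next.
state = ConversationState()
state.record("caller_name", "Sam")
still_needed = state.missing_fields()
state.record("enquiry_type", "sales")
destination = state.route()
```

In this sketch the model never chooses the route. It only supplies the values, and the audit log can be inspected while the call is still in progress.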
The model’s job is everything else: taking a caller’s stumbling explanation and understanding what they actually need, responding in a tone that matches the moment and handling the unexpected phrasing, the half-finished sentence and the caller who changes their mind mid-thought. That’s where language models are genuinely good.
But here’s what most teams miss. The structure doesn’t just constrain the model; it improves it. An LLM that knows where it is in a conversation (what’s already been covered and what still needs to happen) produces sharper, more relevant responses than one that’s improvising toward an open goal. Grounding the model in a defined flow means it can focus solely on language instead of splitting effort between deciding what to do and how to say it.
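One way to picture that grounding is a prompt assembled by the control layer for each step. This is a rough sketch, not a recommended prompt: the function name and the wording of the instructions are assumptions made for illustration, and no real LLM API is called. What matters is that the code tells the model where it is, and the model only chooses the words.

```python
def build_step_prompt(covered: dict, current_step: str, remaining: list) -> str:
    """Compose instructions that tell the model where it is in the flow."""
    # Summarize what the control layer has already recorded.
    covered_text = ", ".join(f"{k}={v}" for k, v in covered.items()) or "nothing yet"
    # List later steps so the model knows not to jump ahead.
    remaining_text = ", ".join(remaining) or "none"
    return (
        "You are a phone receptionist.\n"
        f"Already covered: {covered_text}.\n"
        f"Your only task now: {current_step}.\n"
        f"Still to come (do not start these yet): {remaining_text}.\n"
        "Choose the wording and tone yourself; do not change the task."
    )


# Example: the caller has given a name; the next designed step is the enquiry.
prompt = build_step_prompt(
    covered={"caller_name": "Sam"},
    current_step="find out what the caller needs help with",
    remaining=["confirm a callback number"],
)
```

The sequencing lives in the arguments, which come from code. The model is left with a single, narrow language task for each turn.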
Without a control layer, you get agents that sound natural but can’t be audited or trusted. We’ve seen agents confirm appointments before collecting contact details, offer refunds that don’t exist and cheerfully route callers to teams that closed hours ago. Nobody can diagnose why it went wrong because nobody designed the flow. You can’t quality assure a conversation that’s different every time.
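Two of those failures, confirming an appointment without contact details and routing to a closed team, are the kind a few lines of deterministic code could block before the model ever speaks. The sketch below assumes hypothetical action names, field names and opening hours; it shows the shape of such a guard rather than a complete guardrail system.

```python
from datetime import time

# Hypothetical opening hours and action preconditions, invented for illustration.
TEAM_HOURS = {"sales_team": (time(9, 0), time(17, 30))}
PRECONDITIONS = {"confirm_appointment": ("contact_phone",)}


def check_action(action: str, collected: dict) -> list:
    """Return the fields that must still be collected before this action may run."""
    return [f for f in PRECONDITIONS.get(action, ()) if f not in collected]


def team_is_open(team: str, now: time) -> bool:
    """True only if the team has listed hours and `now` falls inside them."""
    hours = TEAM_HOURS.get(team)
    if hours is None:
        return False  # unknown team: fail closed rather than guess
    opens, closes = hours
    return opens <= now < closes


# Example: the agent wants to confirm an appointment but has no phone number yet.
blocked_by = check_action("confirm_appointment", {"caller_name": "Sam"})
open_at_ten = team_is_open("sales_team", time(10, 0))
open_at_seven_pm = team_is_open("sales_team", time(19, 0))
```

Because each check is an ordinary function, it can be unit tested, and a blocked action leaves a specific reason that someone can diagnose later.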
Getting It Right
The companies that are winning with voice AI aren’t the ones with the best models or even the most powerful ones. They’re the ones that have stopped outsourcing the conversation to the model and started designing it deliberately. They’ve found the balance between control and freedom, they design the parts that matter for compliance and auditability, and they free the model to do what no script ever could.
This means drawing a clear boundary between what the system controls and what the model generates. The structure (sequencing, routing, guardrails and business logic) of my company’s AI, for example, is deterministic and designed. When the LLM operates within that structure, it can bring natural language flexibility without improvising along the way.
It’s important to be able to prove, in real time, that the agent has done what it was supposed to do without the caller ever feeling like they’re talking to a system. When you design the conversation deliberately, you don’t lose flexibility. You gain control, consistency and trust.
Ask your AI vendor one question. Where is the boundary between the model and the code? If they can’t show you what’s deterministic and what’s generated, and if they can’t prove the agent completed every required step in a conversation, you haven’t automated your customer experience. You’ve outsourced it.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.








