South African businesses are rolling out AI voice agents at speed, but most were built in the United States — sounding American, “thinking” in English and processing calls through servers on the other side of the world. Researchers at the University of Cape Town and Cape Town startup Untapped AI argue the problem runs deeper than vendors are willing to admit.
Voice is still the dominant customer service channel in South Africa, and when it breaks, customers hang up. Untapped AI began investigating the issue 18 months ago. The company owns a call centre with more than a decade of operational data, and that data pointed to a specific pattern: drop-off rates climb when customers hear robotic or foreign-sounding voices.
“Our data indicates that the drop-off rate is significantly higher when customers encounter robotic or foreign accents,” said Lloyd Matthew, CEO of Untapped AI. The company recorded 20 South African voice agents in a studio rather than generating them synthetically. Matthew said the difference matters because South Africans use slang and expressions that international platforms do not carry. Untapped AI currently supports English and Afrikaans, with isiZulu and isiXhosa in development. For now, when a caller switches to an African language, the agent asks them to continue in English or Afrikaans.
Accent is only part of what makes voice AI fail locally. Bruce von Maltitz, CEO of South African contact centre and hosted telephony provider 1Stream, argues that most businesses are treating AI as a software problem when it is actually a telephony engineering problem. The threshold that matters, von Maltitz said, is around 400 milliseconds: respond faster than 300ms and the bot sounds abrasive; respond slower than 700ms and the customer realises they are talking to a machine that cannot keep up.
“You can start an AI business from scratch and be proficient at software, but delivering a high-quality voice experience requires a deep understanding of legacy telephony, routing and local infrastructure,” von Maltitz wrote in an analysis shared with TechCentral. The common failure mode, he said, is the “bolted-on” approach — companies layering AI on top of existing telephony without accounting for the latency that accumulates at each step. In South Africa, where calls often route through international servers, that latency compounds.
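The arithmetic behind that argument can be sketched in a few lines of Python. Only the 300ms and 700ms thresholds come from von Maltitz; the stage names and per-stage timings below are hypothetical figures chosen for illustration, not measurements from 1Stream, Untapped AI or any other vendor.

```python
# Thresholds quoted by von Maltitz, in milliseconds.
TOO_FAST_MS = 300   # below this, the bot sounds abrasive
TOO_SLOW_MS = 700   # above this, the caller notices the machine


def total_latency_ms(stages):
    """Add up the delay contributed by each step in the call path."""
    return sum(stages.values())


def classify_response(total_ms):
    """Place a total response time against the quoted thresholds."""
    if total_ms < TOO_FAST_MS:
        return "too fast"
    if total_ms > TOO_SLOW_MS:
        return "too slow"
    return "natural"


# ASSUMPTION: every stage name and timing here is made up for illustration.
local_stages = {
    "telephony_in": 40,
    "speech_to_text": 120,
    "language_model": 150,
    "text_to_speech": 80,
    "telephony_out": 40,
}

# Same pipeline, plus a hypothetical round trip to an overseas server.
overseas_stages = dict(local_stages, network_round_trip=350)

for name, stages in (("local", local_stages), ("overseas", overseas_stages)):
    total = total_latency_ms(stages)
    print(f"{name}: {total}ms -> {classify_response(total)}")
```

With these made-up numbers, the locally hosted path totals 430ms and stays inside the window, while the same pipeline with an international round trip totals 780ms and falls outside it. No single stage is slow on its own; the total is what breaches the limit.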
Untapped AI says local hosting is central to its offering. Matthew said the company routes calls through South African data centres and holds ISO 27001 certification through its own voice platform. “By us being in control of the full stack, all the way from GPUs to the delivered AI voice agent, we can provide the necessary assurances to ensure data protection,” Matthew said.
Accent and latency are solvable engineering problems. The third challenge — language — is not.
UCT’s department of computer science published research this month introducing MzansiLM, which it describes as the first publicly available AI language model trained on all 11 of South Africa’s official written languages. The team has made the model freely available for researchers and developers. “In language modelling, languages are considered low-resource primarily because there are much fewer and smaller textual datasets available in these languages for training language models,” said Jan Buys, a senior lecturer at UCT and one of the project’s lead researchers.
Nine of South Africa’s 11 official written languages fall into that low-resource category, and the gap runs deeper still: isiNdebele and Sepedi remain severely underrepresented even within MzansiLM itself. Asked what happens when a South African customer speaks isiZulu or Sepedi to a deployed AI system, Buys said some larger models have limited isiZulu support, while another common approach is to translate the query into English and process it there. He could not say definitively what businesses are doing in practice, and that uncertainty, he said, is the point.
Buys said enterprises buying tools that claim to be “built for South Africa” should press vendors on two things — what language support actually means and how it can be verified, and whether the underlying model understands South African context, regulation and business environments, or whether it is simply an international model with a local label. “The more transparency there is, the better one is able to assess these things,” he said.
He also raised a risk most enterprise discussions do not reach. As AI voice localises further, it becomes a more effective fraud tool: a voice agent that sounds authentically South African could target elderly consumers who would otherwise recognise a foreign-sounding call as suspicious. “There are broader societal and regulatory issues that one would have to think through as they come up,” Buys said.
Untapped AI said more than 5,000 of its agents are currently live, a figure Matthew said includes deployments within the group’s own companies and call centres it already services. He expects Microsoft and Amazon to localise properly for South Africa within 12 to 18 months, and is betting that market share built before then will be hard to displace.
Buys takes a longer view. MzansiLM, he said, is not a product but evidence that South Africa needs to build its own AI capacity rather than depend on the willingness of large American companies to support local languages. “We shouldn’t only have to rely on the support that some big US company can provide,” he said.
The MzansiLM model has 125 million parameters — small by commercial standards — and Buys acknowledged that larger models will still outperform it in most practical applications today. But the question for enterprise buyers, he said, is not whether the research outperforms the commercial alternatives. It is whether the commercial tools South African businesses are already paying for are doing any better.





