APAC's AI Infrastructure crunch: How enterprises are figuring it out

How ready is our tech infrastructure to meet growth and energy demands? In part one last week, we examined why Asia-Pacific's AI ambitions are running headlong into a physical infrastructure crisis: power grids that can't keep pace, hardware supply chains under siege, and production failure rates that expose the gap between AI demos and AI deployments.

In this second part of our two-part monthly feature, we hear further perspectives from industry players on how they are responding.

A consensus is forming across APAC boardrooms: depending entirely on centralised hyperscale cloud for real-time AI inference isn't financially or operationally sustainable. The economics no longer work.

The market is responding accordingly. The global edge AI market is projected to skyrocket from $11.8 billion to $57 billion by 2030, and 80 percent of CIOs expect to rely heavily on distributed edge services by 2027, according to Akamai.

Jay Jenkins, the company's Chief Technology Officer of Cloud Computing, argues the shift is structural, not cyclical.

The fundamental challenge is no longer just about securing more compute; it's about using the compute you have infinitely more efficiently.

- Jay Jenkins, Chief Technology Officer of Cloud Computing, Akamai Technologies.

"For latency-sensitive workloads, that means placing inference directly adjacent to the users, devices, and physical locations where data is generated. That is how organisations improve performance, slash bandwidth costs, and entirely avoid the round-trip delays that come with routing every single micro-interaction back to a distant cloud core," added Jenkins.

Joseph Sulistyo, Senior Vice President of Corporate Marketing at AI chip company Blaize, puts it bluntly: centralised cloud environments were built for the training boom. They fail when it comes to scaling enterprise inference.

Sumner Lemon, Senior Director of Data Centre and AI Go-To-Market, APJ, at Intel, argues that this requires rethinking the hardware stack entirely. "Inference requires a fundamentally different approach than training. It demands heterogeneous hardware configurations and a diverse mix of large and small language models to scale cost-effectively."

Intel's internal testing shows that offloading orchestration and data preparation to high-performance CPUs can reduce specialised GPU costs by up to 35 percent.

OVHcloud's APAC Cloud Solutions Architect, Shiv Kumar, recommends architectures that dynamically balance serverless AI frameworks with traditional bare-metal compute, a strategy he says can cut overall IT infrastructure costs by up to 30 percent.

By strategically balancing workloads between serverless AI and traditional compute, organisations can achieve significant cost savings and operational efficiencies.

- Shiv Kumar, Cloud Solutions Architect, APAC, OVHcloud.

Wai Kit Cheah, APAC CISO and Connected Ecosystem Leader at Lumen Technologies, argues the network layer is where performance problems actually originate. "Managing rising AI compute demand is less about adding raw capacity and more about how infrastructure is designed, interconnected, and operated at scale," he says.

Performance and cost constraints often materialise in the network layer long before they hit the silicon.

- Wai Kit Cheah, CISO and Connected Ecosystem Leader, APAC, Lumen Technologies.

Data sovereignty is an architectural constraint, not a compliance checkbox

APAC isn't a monolith. It's a fragmented patchwork of jurisdictions, varying infrastructure capabilities, and increasingly nationalist data policies. That's reshaping enterprise architecture decisions in ways that pure cost optimisation can't account for.

Blaize's Sulistyo identifies a powerful "sovereignty signal" dominating enterprise conversations across the region. Indonesia and India are aggressively prioritising national oversight over data residency. The compliance layer is non-negotiable.

In APAC, layered on top of compute cost pressure is a powerful mandate for national data sovereignty.

- Joseph Sulistyo, Senior Vice President of Corporate Marketing, Blaize.

"Governments and enterprises aren't just worried about operational margins, they are deeply protective of control. Where does the data live? Who has access to it? You cannot simply route everything through a centralised US hyperscaler and call the problem solved," says Sulistyo.

Ben Tulloch, Executive Managing Director, Advisory Services, APAC, NTT DATA, warns that enterprises have a narrow window to get their architectural commitments right. NTT DATA projects sovereign cloud adoption across APAC will surge by 50 percent over the next two years as organisations scramble to insulate themselves from geopolitical risk and shifting cross-border data laws.

Organisations must make deliberate architectural choices up front - whether public, private, sovereign, or hybrid - because these decisions lock in their cost structures, governance frameworks, and operational flexibility for years.

- Ben Tulloch, Executive Managing Director, Advisory Services, APAC, NTT DATA.

The ROI reckoning: AI's honeymoon is officially over

Boards are done with patience. Enterprise AI budgets are under a microscope, and the infrastructure imbalances are making the financial picture murkier.

Simon Rizkalla, New Relic's Vice President of Customer Advocacy for Asia-Pacific and Japan, points to a hidden tax paid by regional enterprises: during periods of peak global demand, APAC traffic is frequently deprioritised when routed through US and European data centres, translating directly to higher latency and degraded model performance for end users.

Meanwhile, enterprises are scaling AI workloads faster than their ability to track what they actually cost. The telemetry that traditional IT monitoring tools generate was never designed to capture LLM-specific metrics like token quality or structural costs.

You cannot control what you cannot measure.

- Simon Rizkalla, Vice President of Customer Advocacy for Asia-Pacific and Japan, New Relic.

Rizkalla advocates for integrating a real-time financial lens directly into the AI engineering stack, tracking exactly how token usage translates to actual spend. New Relic cut its own internal cloud production costs by 60 percent per gigabyte by implementing this approach.
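
As a rough illustration of what tying token usage to spend could look like, here is a minimal sketch in Python. It is not New Relic's implementation: the model names, the per-1,000-token prices, and the `SpendMeter` class are all hypothetical, and a real deployment would read token counts from the provider's response metadata and prices from its current rate card.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical per-1,000-token prices in USD; real prices vary by provider and model.
PRICES_PER_1K = {
    "large-model": {"input": 0.0100, "output": 0.0300},
    "small-model": {"input": 0.0005, "output": 0.0015},
}


@dataclass
class SpendMeter:
    """Accumulates estimated spend per business service as calls are recorded."""

    totals: dict = field(default_factory=lambda: defaultdict(float))

    def record(self, service: str, model: str, input_tokens: int, output_tokens: int) -> float:
        """Convert one call's token counts to an estimated cost and add it to the service total."""
        price = PRICES_PER_1K[model]
        cost = (input_tokens / 1000) * price["input"] + (output_tokens / 1000) * price["output"]
        self.totals[service] += cost
        return cost

    def report(self) -> dict:
        """Return estimated spend per service, rounded to four decimals, sorted by service name."""
        return {service: round(total, 4) for service, total in sorted(self.totals.items())}


meter = SpendMeter()
meter.record("support-chat", "large-model", input_tokens=1200, output_tokens=400)
meter.record("support-chat", "small-model", input_tokens=1200, output_tokens=400)
meter.record("doc-search", "small-model", input_tokens=3000, output_tokens=200)
print(meter.report())
```

With these illustrative prices, the same 1,200-token request costs roughly twenty times more on the larger model than on the smaller one. A per-service view of this kind is what makes that difference visible before the bill arrives.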

Datadog's Narayana echoes the call, pushing for unified dashboards that correlate cost metrics and performance KPIs. "The goal is to isolate which business services actually justify their GPU allocations."

NTT DATA's Tulloch makes the stakes plain. "Compute resources must be explicitly rationed and allocated to use cases with proven business impact, regulatory clarity, and strict financial accountability. This forces an aggressive, deliberate corporate distinction between AI initiatives that actively earn their compute and speculative projects that fail to justify the baseline cost."

OpenAI's Jay points to BCG research showing committed AI leaders achieved 1.7x higher revenue growth and 3.6x greater total shareholder return over three years compared to laggards, but the returns only materialise when employees move well beyond basic prompting into deeply integrated workflows.

"The greatest enterprise returns occur when AI is embedded into core business workflows across entire teams," added Jay.

The efficiency drain hiding in plain sight

Here's the twist: a significant portion of the apparent compute shortage may actually be a data architecture problem in disguise.

Remus Lim, Senior Vice President for Asia Pacific and Japan at Cloudera, surfaces a stark paradox. While 85 percent of APAC organisations claim clear visibility over their data estates, 38 percent admit they can't actually use that data effectively.

Fragmented data architectures force AI systems to repeatedly reconcile duplicate information across disconnected silos, spiking compute usage without improving model outcomes.
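
To make the cost of duplication concrete, the sketch below collapses identical records held in separate silos before any expensive model call is made. It is a simplified Python illustration only: the silo names and records are invented, and exact matching on normalised text is an assumption. Real reconciliation usually needs fuzzy matching and governance rules that this does not attempt.

```python
import hashlib


def normalise(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def dedupe_records(silos: dict) -> list:
    """Keep the first copy of each distinct record across silos, keyed by a hash of its normalised text."""
    seen = {}
    for silo, records in silos.items():
        for record in records:
            digest = hashlib.sha256(normalise(record).encode("utf-8")).hexdigest()
            # setdefault leaves the first-seen copy in place and ignores later duplicates
            seen.setdefault(digest, (silo, record))
    return list(seen.values())


# Invented example data: the same customer appears in two systems with different formatting.
silos = {
    "crm": ["Acme Pte Ltd, Singapore", "Globex KK, Tokyo"],
    "billing": ["acme pte ltd,  singapore", "Initech, Sydney"],
}

unique = dedupe_records(silos)
total = sum(len(records) for records in silos.values())
print(f"{len(unique)} of {total} records need downstream processing")
```

Here one of the four records is a duplicate, so a quarter of any per-record compute would have been spent on information the organisation already held.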

"What appears to be a massive surge in AI demand or a severe shortage of compute capacity is frequently just a reflection of deep systemic inefficiencies in underlying data pipelines," Lim says.

Organisations are burning expensive compute to compensate for disconnected data architectures, rather than generating real business value.

- Remus Lim, Senior Vice President, Asia Pacific and Japan, Cloudera.

NTT DATA's Tulloch adds a warning for engineering teams considering the shortcut: running advanced models on unoptimised pipelines produces expensive, incorrect outputs faster. His prescription, shared by Cloudera's Lim, is federated data architecture, which allows AI models to query enterprise data securely where it lives rather than migrating petabytes into centralised cloud repositories.

The nuclear option, and why hyperscalers aren't decentralising

Not everyone is betting on edge distribution. At the top of the market, a well-capitalised counter-narrative is taking shape: hyper-centralisation at a scale that sidesteps conventional constraints entirely.

Every major cloud hyperscaler has signed at least one nuclear energy procurement agreement to backstop their AI data infrastructure.

More than 25-30 percent of all incremental data centre megawatts deployed through 2030 will utilise behind-the-meter power generation that completely bypasses the public grid, by building facilities directly adjacent to natural gas generation plants.

- Mandeep Singh, Global Head of Technology Research, Bloomberg Intelligence.

Bloomberg Intelligence's Global Head of Technology Research, Mandeep Singh, notes that the rate limits currently frustrating enterprise Anthropic users are enforced at a global corporate level, meaning large-scale funding rounds and compute partnerships can rapidly shift capacity across the ecosystem overnight. The infrastructure war isn't over; it's barely started.

The playbook for APAC enterprises

The through-line across every conversation in this space is the same: treating AI as a standard software procurement exercise is a path to operational failure. The enterprises that navigate the infrastructure crunch will be those that build for it from day one.

Firstly, this means model-agnostic architectures with abstraction layers that fail over automatically between OpenAI, Anthropic, and open-source alternatives like Llama when rate limits hit.

Secondly, it means shifting cost awareness left in the development lifecycle, evaluating token efficiency before writing production code, not after the cloud bill arrives.

Thirdly, it means deploying observability platforms that treat real-time spend as a live operational KPI, not a monthly accounting exercise.

Lastly, it requires fixing data architectures before reaching for more GPUs.
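
The first item in the playbook, an abstraction layer that fails over between model providers when rate limits hit, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's SDK: the `RateLimited` exception, the `complete_with_failover` function, and the three stub providers are hypothetical stand-ins for real adapters, which would translate each provider's own throttling response (typically HTTP 429) into a common exception.

```python
from typing import Callable, Sequence, Tuple


class RateLimited(Exception):
    """Raised by a provider adapter when the upstream service throttles a request."""


# Each provider is a (name, callable) pair; the callable takes a prompt and returns text.
Provider = Tuple[str, Callable[[str], str]]


def complete_with_failover(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try providers in order, moving to the next one only when the current one is rate limited."""
    skipped = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except RateLimited:
            skipped.append(name)
    raise RuntimeError(f"all providers rate limited: {skipped}")


# Stub adapters standing in for real SDK calls (hypothetical, for illustration only).
def primary(prompt: str) -> str:
    raise RateLimited("429 Too Many Requests")


def secondary(prompt: str) -> str:
    return f"[secondary] {prompt}"


def self_hosted(prompt: str) -> str:
    return f"[self-hosted] {prompt}"


used, answer = complete_with_failover(
    "Summarise Q3 churn drivers",
    [("primary", primary), ("secondary", secondary), ("self-hosted", self_hosted)],
)
print(used, answer)
```

Only rate-limit errors trigger failover in this sketch. Other failures propagate to the caller, on the assumption that a malformed request would fail on every provider anyway.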

The enterprises (and nations) that treat infrastructure as a core strategic asset from here forward are the ones that will still be running production AI in five years. The rest are one capacity crunch away from a very expensive lesson.

