As AI moves to production, enterprises must confront limits of current stacks

A flexible and scalable infrastructure is critical to overcoming bottlenecks in AI deployment, says MongoDB’s field CTO.

As AI adoption in Asia-Pacific moves from pilot projects to production, enterprise data systems are under pressure to adapt. Traditional stacks built by stitching together separate vector databases, search tools, and inference engines often break down at scale, especially in multilingual and multi-region environments.

These fragmented setups add latency, duplicate data, and increase operational overhead. To solve this, CIOs are turning to composable AI architectures: modular stacks that integrate search, storage, and inference without sacrificing scalability.

A key design question now emerging: Should vector search sit inside the transactional database or live in a dedicated system?

MongoDB’s vice president and field CTO, Boris Bialek, told iTnews Asia that many teams are getting this balance wrong.

“Problems start when you try to run high-speed transactional workloads and vector search in the same system,” Bialek said. “Every time a new transaction happens, it updates the vector index too, and that slows everything down.”

AI architectures must not break in production

What works in a demo often breaks under real-world load. In multilingual, multi-region environments like APAC, rushed architectural choices quickly expose limits.

A common misstep is embedding vector search directly into the transactional database, said Bialek.

While this keeps everything in one place, it often leads to performance degradation.

“Many so-called 'native' vector features are just blobs (binary large objects) behind the scenes. When high-speed transactions run alongside compute-heavy vector queries, both slow down,” said Bialek.

In response, teams start splitting systems, duplicating data, and syncing changes through Kafka or ETL pipelines.

“It becomes what I call ‘management by Nike’ - everyone’s running between systems trying to keep them in sync. What started as a simple idea ends up as a fragmented setup that’s hard to scale,” he added.

The alternative, adding a separate vector database, can also backfire.

It introduces glue code, near-real-time sync jobs, and risks of stale or inconsistent data.

“Once you start duplicating vectors and managing sync jobs, you’ve lost the simplicity you were aiming for.”

- Boris Bialek, VP and Field CTO, MongoDB

Instead, Bialek recommends a composable architecture, where modular systems are natively integrated into a unified stack.

In MongoDB’s case, that includes an operational database, a dedicated vector search layer, and built-in text search, coordinated internally, without external pipelines or duplication.
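For illustration, a minimal sketch of what that unified pattern looks like from an application’s point of view, assuming a MongoDB Atlas cluster with a vector search index named product_vec_idx (all names here are invented):

```python
# A minimal sketch of the unified pattern described above, using PyMongo.
# Assumes an Atlas cluster with a vector search index named
# "product_vec_idx" on the "embedding" field; all names are illustrative.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<cluster-uri>")
products = client["shop"]["products"]

# Transactional write: the same collection serves the operational workload.
products.insert_one({"sku": "A-100", "name": "rice cooker",
                     "embedding": [0.12, -0.03, 0.57]})  # toy 3-dim vector

# Vector retrieval runs as an aggregation stage over the same data,
# with no ETL pipeline and no duplicated copy of the documents.
results = products.aggregate([
    {"$vectorSearch": {
        "index": "product_vec_idx",
        "path": "embedding",
        "queryVector": [0.10, -0.01, 0.60],
        "numCandidates": 100,
        "limit": 5,
    }},
    {"$project": {"name": 1, "score": {"$meta": "vectorSearchScore"}}},
])
for doc in results:
    print(doc)
```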

Such an architecture eliminates friction and allows engineering teams to build reliable, production-ready AI systems.

However, as CIOs modernise AI stacks, many still face strategic concerns, particularly around over-consolidation and the risk of vendor lock-in.

Avoid lock-in through openness and flexibility

Addressing this concern, Bialek suggests reframing the discussion not as risk management, but as a question of flexibility and long-term value.

“It's not about being locked in or out, it's about being able to adapt as needs evolve,” said Bialek.

Modern data architecture built on open standards, such as the JSON document model, allows organisations to move components in or out as needed.

In MongoDB’s case, the use of non-proprietary formats and interoperable components means teams can integrate open-source tools, extract modules, or migrate workloads without being tightly bound to a single vendor ecosystem.

This openness is essential as enterprises now expect not just functionality, but continuous innovation, operational simplicity, and scalable systems without added complexity.

However, meeting expectations isn’t just about architecture in theory; it’s about how systems perform under real-world conditions.

Lessons from real-world AI deployments

In multilingual, multi-regulatory environments like Southeast Asia, India or Europe, the ability to localise data, models, and inference workflows becomes essential.

Bialek notes that ASEAN and India are similar to Europe in their diversity of cultural attitudes, app usage patterns, and infrastructure challenges.

MongoDB’s document model supports type stability, applies schema where needed, and maintains consistent behaviour across languages.

This flexibility enables enterprises to build multilingual, domain-specific applications without adding operational burden.
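As a hedged illustration of “schema where needed”, a MongoDB collection can carry a JSON Schema validator that pins down critical fields while the rest of the document stays free-form; the collection and field names below are invented:

```python
# A hedged illustration of "schema where needed": a JSON Schema validator
# pins down the critical fields, while everything else stays free-form.
# Collection and field names are invented.
from bson.decimal128 import Decimal128
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["catalog"]

db.create_collection("items", validator={
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["sku", "price"],
        "properties": {
            "sku": {"bsonType": "string"},
            "price": {"bsonType": "decimal"},
        },
    }
})

# Documents can carry locale-specific fields without any migration:
db["items"].insert_one({
    "sku": "B-200",
    "price": Decimal128("19.90"),
    "name_en": "rice cooker",
    "name_th": "หม้อหุงข้าว",  # Thai product name
})
```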

Bialek said two factors are critical in these environments: scalability and deployment flexibility.

“A major retail group based in Bangkok, for example, runs sharded clusters across Singapore, Kuala Lumpur, Jakarta, and Bangkok. Each region handles local writes and enforces data sovereignty, while the system maintains a unified customer view,” said Bialek.

This setup lets the business recognise a customer across countries, including Thailand and Malaysia, without disrupting service.
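A rough sketch of how such a region-pinned deployment can be expressed with MongoDB’s zone sharding commands follows; the shard names, zones, and namespace are illustrative, not details of the retailer’s actual setup:

```python
# A rough sketch of region-pinned (zone) sharding via PyMongo admin
# commands; shard names, zones, and the namespace are illustrative.
from bson.min_key import MinKey
from bson.max_key import MaxKey
from pymongo import MongoClient

admin = MongoClient("mongodb://<mongos-uri>").admin

admin.command({"enableSharding": "retail"})
admin.command({"shardCollection": "retail.customers",
               "key": {"region": 1, "customer_id": 1}})

# Pin each region's key range to shards hosted in that country, so writes
# stay local and data-sovereignty rules are enforced by placement.
for shard, zone in [("shard-sg", "SG"), ("shard-my", "MY"),
                    ("shard-id", "ID"), ("shard-th", "TH")]:
    admin.command({"addShardToZone": shard, "zone": zone})
    admin.command({"updateZoneKeyRange": "retail.customers",
                   "min": {"region": zone, "customer_id": MinKey()},
                   "max": {"region": zone, "customer_id": MaxKey()},
                   "zone": zone})
```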

In India, banks deploy across Mumbai, Bangalore, and Hyderabad to support local writes and global reads. Even if one region goes offline, MongoDB’s architecture keeps operations running; no custom routing or failover tools are required.
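The “local writes, global reads” behaviour he describes maps onto standard driver settings; a minimal sketch with PyMongo, where the URI and database names are placeholders:

```python
# A minimal sketch of "local writes, global reads" with automatic
# failover, assuming a multi-region cluster; the URI is a placeholder.
from pymongo import MongoClient, ReadPreference

client = MongoClient(
    "mongodb+srv://<multi-region-cluster>",
    w="majority",  # writes acknowledged by a majority across regions
)

# Route reads to the nearest healthy member; if a region goes offline,
# the driver fails over to the remaining members without custom routing.
accounts = client.get_database(
    "bank", read_preference=ReadPreference.NEAREST
)["accounts"]
print(accounts.find_one({"account_id": "IN-001"}))
```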

Bialek mentions that non-functional requirements like high availability, encryption, key rotation, and vector scalability become critical.

These capabilities often get overlooked but are essential for long-term performance, compliance, and enterprise trust.

As enterprises scale AI beyond pilots, foundational capabilities like scalability and security become essential for delivering production-ready systems that meet both technical and business needs.

What production-ready AI requires

In ASEAN and similar regions, many organisations still experiment with AI, often prompted by boardroom directives to adopt a formal strategy.

Bialek said there is a growing transition toward structured, business-led implementations.

AI adoption today aligns closely with tangible business goals, like logistics optimisation, personalised customer experiences, and operational efficiency.

Business and technical leaders now work together, moving AI from exploratory phases into real-world production.

Despite such successes, Bialek points to a major bottleneck: moving from prototype to production, where promising AI projects falter in the absence of scalable infrastructure.

He emphasises the importance of AI-specific CI/CD pipelines that ensure data traceability, compliance, and governance, elements that are often overlooked in early-stage experimentation.

As full-stack RAG deployments begin to enter production across the region, Bialek sees signs of growing enterprise maturity.

However, he cautions that long-term success requires strong delivery pipelines and tight alignment between business priorities and technical execution.

Understand your priorities before rethinking the AI stack

As enterprises scale AI, real-time context to reduce LLM hallucinations becomes essential, especially in critical use cases like fraud detection, Bialek said.

Embedding live metadata such as payer, payee, and location helps ground model outputs in accurate, actionable data.
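As an illustrative sketch of that grounding step, the snippet below pulls live transaction metadata into a fraud-check prompt; the field names and the call_llm() helper are hypothetical, not a specific product API:

```python
# An illustrative sketch of grounding a fraud-check prompt in live
# transaction metadata. Field names and the call_llm() helper are
# hypothetical.
from pymongo import MongoClient

txns = MongoClient("mongodb://localhost:27017")["payments"]["transactions"]

def build_grounded_prompt(txn_id: str) -> str:
    txn = txns.find_one({"_id": txn_id})
    # Embed payer, payee, and location directly in the prompt, so the
    # model reasons over verified facts rather than guessing.
    return (
        "Assess the fraud risk of this transaction.\n"
        f"Payer: {txn['payer']}\n"
        f"Payee: {txn['payee']}\n"
        f"Location: {txn['location']}\n"
        f"Amount: {txn['amount']} {txn['currency']}\n"
    )

# response = call_llm(build_grounded_prompt("txn-42"))  # hypothetical LLM call
```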

An effective AI stack should support hybrid search, combining vector and text search within a unified system.

Bialek says MongoDB’s integration with Voyage AI delivers real-time embeddings and retrieval without relying on external pipelines or complex system sprawl.
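One hedged way to sketch such hybrid retrieval is to run Atlas Vector Search and Atlas Search side by side and fuse the ranked results in application code; reciprocal rank fusion is this sketch’s own choice, and the index names and embed() helper (for example, a Voyage AI embedding call) are assumptions:

```python
# A hedged sketch of hybrid retrieval: run Atlas Vector Search and Atlas
# Search side by side, then fuse ranks in application code. Reciprocal
# rank fusion is this sketch's own choice; index names and the embed()
# helper are assumptions.
from pymongo import MongoClient

docs = MongoClient("mongodb+srv://<cluster-uri>")["kb"]["articles"]

def embed(text: str) -> list[float]:
    raise NotImplementedError  # plug in an embedding model here

def hybrid_search(query: str, k: int = 5) -> list:
    vector_hits = docs.aggregate([
        {"$vectorSearch": {"index": "vec_idx", "path": "embedding",
                           "queryVector": embed(query),
                           "numCandidates": 100, "limit": 20}},
        {"$project": {"_id": 1}},
    ])
    text_hits = docs.aggregate([
        {"$search": {"index": "text_idx",
                     "text": {"query": query, "path": "body"}}},
        {"$limit": 20},
        {"$project": {"_id": 1}},
    ])
    # Reciprocal rank fusion: score each document by 1 / (60 + rank) in
    # each result list, then sum the scores across the two lists.
    scores: dict = {}
    for hits in (vector_hits, text_hits):
        for rank, doc in enumerate(hits):
            scores[doc["_id"]] = scores.get(doc["_id"], 0.0) + 1 / (60 + rank)
    return sorted(scores, key=scores.get, reverse=True)[:k]
```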

To future-proof AI architecture, enterprises need to prioritise real-time processing, unified data access, and simplified infrastructure.

They should avoid siloed systems and adopt composable platforms that strike a balance between flexibility and performance.
