Indonesia's largest e-commerce platform Tokopedia is continuing to build on its multi-cloud strategy in collaboration with open-source distributed SQL database company Yugabyte to modernise and migrate its database systems to a cloud-native environment.
Tokopedia’s technical architect Felix Christian told the Distributed SQL Summit Asia 2023 that the company has “successfully” migrated from a monolithic Postgres (PostgreSQL) database to a scalable distributed database with YugabyteDB, without impacting user experience or uptime.
“The distributed database can facilitate Tokopedia’s massive weights with unlimited scalability, zero downtime and fault tolerance,” he said.
Founded in 2009, Tokopedia offers online marketplace services and complementary businesses in fintech, payment, logistics, fulfilment and new retail.
It has garnered more than US$1 billion (S$1.33 billion) in investments from companies like Alibaba, SoftBank Group, Sequoia Capital, and Google.
It has also merged with Gojek, the largest courier delivery and ride-hailing service in Indonesia.
The firm serves 11 million merchants and more than 100 million active users on a daily basis. It had struggled to process millions of transactions and product updates.
To maintain an acceptable level of system uptime and latency, the company had to scale up, Christian said.
While most of its services currently use the Postgres database, some teams run on MySQL and other managed services.
“Traditional databases have limitations in scaling as they store all the data in a single node. We need a feature technology like distributed SQL that can enable us to go with a highly scalable system,” he added.
Christian said Tokopedia selected Yugabyte because, if configured “correctly”, it could provide fault tolerance to its systems by storing data across multiple nodes with a defined replication factor.
Tokopedia has focussed on infrastructure and application modernisation as part of its cloud-focused digital transformation plans since 2017.
It adopted a multi-cloud strategy to reduce dependency on a single cloud provider and increase flexibility.
“We enhanced our strategy with active-active multi-cloud interconnects to ensure the reliability of connection between multiple cloud providers,” he explained.
Having started from a monolithic architecture, Tokopedia now builds its applications on a microservice-oriented architecture, which it says improves productivity and cost efficiency.
“Each application provides a set of well-defined APIs that can communicate with other applications,” he added.
The retailer’s software system uses both APIs - Yugabyte Structured Query Language (YSQL), and Yugabyte Cloud Query Language (YCQL) - to meet its business needs.
YSQL is a fully-relational API that is the best fit for scale-out RDBMS (relational database management system) applications that need ultra resilience, massive write scalability and geographic data distribution, while YCQL is a semi-relational SQL API that is the best fit for internet-scale OLTP (online transaction processing) and HTAP (hybrid transactional/analytical processing) applications needing massive data ingestion and fast queries.
Christian said the company’s two critical services, handling order history and product services, require distributed database capabilities.
“To achieve this, we leveraged Yugabyte across multiple zones with a replication factor,” he added.
Yugabyte has eliminated the need for multiple databases, avoiding maintenance and management issues.
Additionally, some services at Tokopedia rely on the NoSQL database management systems Apache Cassandra and Scylla.
“Yugabyte can cover SQL and Cassandra use cases in a single stack offering strong consistency, unlike Cassandra, which is eventual consistency,” he added.
Pointing out the advantage of Yugabyte supporting Postgres SQL features, Christian said the company's teams benefitted from their existing Postgres development experience, which made it easy to carry their work over to Yugabyte.
“Yugabyte could improve query execution by distributing the workload across all replicas in the primary cluster without impacting the cluster itself,” he explained.
Turning to the challenges, he said the main one was determining the optimal replication factor.
“Increasing the replication factor will improve tolerance, but latency will be affected,” he warned.
“We collaborated with Yugabyte teams to understand the criticality of the applications and set up the right replication factor handling latency requirements,” he said.
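
The trade-off Christian describes can be illustrated with some simple arithmetic. The sketch below is not Tokopedia's configuration or a Yugabyte API; it assumes majority-quorum (Raft-style) replication, in which a write is committed once more than half of the replicas acknowledge it. The function names are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: assumes majority-quorum (Raft-style) replication.
# Function names are hypothetical, not part of any Yugabyte API.

def quorum_size(replication_factor: int) -> int:
    """Replicas that must acknowledge a write before it is committed."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be lost while a majority is still reachable."""
    return (replication_factor - 1) // 2


if __name__ == "__main__":
    for rf in (1, 3, 5, 7):
        print(
            f"RF={rf}: quorum={quorum_size(rf)}, "
            f"tolerates {tolerated_failures(rf)} failed replica(s)"
        )
```

Under this assumption, moving from a replication factor of 3 to 5 raises the number of survivable replica failures from one to two, but every write must then wait for three acknowledgements instead of two, which is where the latency cost comes from when replicas sit in different zones or clouds.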