DailyHunt is Verse's flagship news app, with over 300 million daily active users (DAU). Its entire infrastructure ran on-premises in a multi-hypervisor environment composed of bare metal, VMware, and Nutanix. The data pipeline, built on Hadoop, Kafka, Storm, HBase, KStream, and Redis, struggled to scale with growing data volumes. Seeking better scalability, flexibility, and cost-efficiency, the client approached Quark Media with this problem statement.
The on-premises data pipeline faced several challenges:
Frequent downtime of the data pipeline, which was impacting the business.
Limited scalability, hindering the processing of increasing data loads.
High maintenance costs associated with hardware upgrades and maintenance.
Lack of flexibility in adapting to evolving technology and business needs.
The client aimed to achieve the following objectives with the migration:
Seamless transition from on-premises to cloud infrastructure.
Improved scalability to handle growing data volumes.
Better uptime and stability of the data pipeline platform.
Enhanced flexibility and agility to adapt to changing business requirements.
After carefully evaluating performance, efficiency, scale, and cost, we recommended migrating to Google Cloud Platform (GCP) for its robust infrastructure, comprehensive set of cloud services, reliable data stores, big data capabilities, and competitive cost.
For the assessment of the current infrastructure, we recommended tools like StratoZone, which can automatically discover existing infrastructure in any environment, analyze the cost-benefit of moving to the public cloud, and help plan the migration.
Using migration tools such as GCP's Database Migration Service (DMS) and Storage Transfer Service (STS), RIOT for Redis data migration, and MirrorMaker for Kafka data replication, we devised a phased approach for migrating databases and data repositories to the cloud with minimal downtime.
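To illustrate how a cutover step in a phased approach like this might be gated, the sketch below checks whether a Kafka replication job has caught up with its source topic. It is a minimal sketch and not the tooling used in this project: the function names, the offset dictionaries, and the `max_lag` threshold are all hypothetical, and in a real setup the offsets would be read from the source cluster rather than passed in by hand.

```python
from typing import Dict


def partition_lag(end_offsets: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Per-partition lag: source log-end offset minus the replicator's committed offset.

    Both inputs are hypothetical snapshots keyed by partition number.
    """
    lag = {}
    for partition, end in end_offsets.items():
        # A partition with no committed offset yet is treated as fully lagging.
        done = committed.get(partition, 0)
        lag[partition] = max(end - done, 0)
    return lag


def ready_for_cutover(end_offsets: Dict[int, int], committed: Dict[int, int], max_lag: int = 0) -> bool:
    """True when every partition's lag is at or below max_lag (an assumed threshold)."""
    return all(v <= max_lag for v in partition_lag(end_offsets, committed).values())


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the migration.
    end = {0: 1500, 1: 1200}
    committed = {0: 1500, 1: 1180}
    print(partition_lag(end, committed))                  # {0: 0, 1: 20}
    print(ready_for_cutover(end, committed))              # False
    print(ready_for_cutover(end, committed, max_lag=50))  # True
```

Measuring lag on the source side (log-end offset versus the replicator's committed offset) avoids comparing offsets across clusters, which are not guaranteed to line up after replication.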
Following infrastructure-as-code principles, we used tools like Terraform and Ansible to define and provision the cloud infrastructure, so that on-premises setups could be replicated efficiently in the cloud.
We containerized existing applications using Docker and orchestrated them with Google Kubernetes Engine (GKE), ensuring consistency across on-premises and cloud environments.
We introduced continuous integration and continuous deployment (CI/CD) practices to automate testing and deployment, improving both reliability and speed.
A comprehensive analysis of existing infrastructure, data dependencies, and application requirements was conducted to formulate a detailed migration plan.
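One way a dependency analysis like this can feed a migration plan is by grouping components into waves, where each wave depends only on components migrated in earlier waves. The sketch below shows the idea with Python's standard `graphlib` module (Python 3.9+). The dependency map is purely illustrative: the component names are borrowed from the pipeline described above, but the edges between them are assumptions, not the client's actual dependency graph.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def migration_waves(depends_on):
    """Group components into ordered waves.

    depends_on maps each component to the components it depends on.
    Every component in a wave depends only on components in earlier waves.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    # Hypothetical dependency map, for illustration only.
    depends_on = {
        "kafka": [],
        "hbase": [],
        "redis": [],
        "storm": ["kafka", "hbase"],
        "kstream": ["kafka", "redis"],
    }
    for number, wave in enumerate(migration_waves(depends_on), start=1):
        print(f"wave {number}: {wave}")
    # wave 1: ['hbase', 'kafka', 'redis']
    # wave 2: ['kstream', 'storm']
```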
Using GCP DMS, data was migrated with minimal downtime, and data integrity was rigorously maintained throughout the process.
Applications were optimized for cloud architecture, taking advantage of cloud-native services and ensuring efficient resource utilization.
Rigorous testing, including performance testing and data validation, was conducted at each migration phase to identify and rectify issues promptly.
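The data validation mentioned above can be pictured as comparing a fingerprint of each dataset on the source and target sides. The following is a minimal sketch under stated assumptions rather than the validation suite used in this migration: it assumes rows can be fetched from both sides as tuples of simple values, and the helper names `table_fingerprint` and `tables_match` are hypothetical.

```python
import hashlib


def table_fingerprint(rows):
    """Return (row_count, hex_digest) for an iterable of rows, independent of row order."""
    count = 0
    combined = 0
    for row in rows:
        # Hash a canonical text form of each row.
        digest = hashlib.sha256(repr(tuple(row)).encode("utf-8")).digest()
        # Sum the per-row digests modulo 2**256: order does not matter,
        # and duplicate rows still change the result (unlike XOR).
        combined = (combined + int.from_bytes(digest, "big")) % (1 << 256)
        count += 1
    return count, f"{combined:064x}"


def tables_match(source_rows, target_rows):
    """True when both sides have the same row count and the same combined digest."""
    return table_fingerprint(source_rows) == table_fingerprint(target_rows)


if __name__ == "__main__":
    # Illustrative rows only.
    source = [(1, "a"), (2, "b"), (3, "c")]
    target = [(3, "c"), (1, "a"), (2, "b")]  # same rows, different order
    print(tables_match(source, target))       # True
    print(tables_match(source, target[:-1]))  # False
```

Because the fingerprint hashes the `repr` of each row, values that arrive with different types on the two sides (for example `1` versus `1.0`) would be reported as a mismatch, so a real check would normalize types first.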
The migration delivered significant positive outcomes:
The cloud-based data pipeline scales readily to handle increased data loads, ensuring business continuity. Analysts are happier too, since the data is always available to query and visualize.
The client now enjoys increased flexibility, easily adapting the data pipeline to evolving business needs.
More traffic can now be served on a reduced infrastructure footprint, with better performance.
The migration led to optimized resource utilization, resulting in substantial cost savings compared to the on-premises infrastructure.