
Hopi – Google Cloud Data Analytics Success Story


Hopi scales its data analytics capabilities and personalizes its offerings with Google Cloud Data Analytics.


Results
- Higher performance due to managed services
- Scales at speed to make real-time data analytics possible using BigQuery, Dataflow, and Google Kubernetes Engine
- Decrease in operational headcount costs
- 40% cost reduction by pre-purchasing BigQuery slots
- Increase in data processing speed

ABOUT HOPI

Hopi, launched by Boyner Group in 2015, is a loyalty app that offers a personalized shopping experience using big data. It partners with brands across many sectors, from clothing and technology to travel and food. It delivers campaigns tailored to its users' lifestyles, tastes, and needs, and the shopping experience improves the more the app is used. Hopi processes 15 million real-time events daily and works with 65 TB of data, 8,500 beacons, 40 different data sources, and more than 16,500 points of sale (POS).

THE STORY

Kartaca has established a data warehouse (DWH) infrastructure for Hopi that reads real-time events and collects batch data from third-party companies into BigQuery. From BigQuery, the BI team builds dashboards that give internal teams real-time access to time-sensitive data to support strategic decisions, and the data science teams use the same data for segmentation.

Hopi's data infrastructure, previously hosted on on-premises systems, was moved to the cloud. Following this migration, the Hopi team reports that data processing speed has increased and that the data pipeline has become more secure and flexible during data transfer. Data transfer processes are much more stable than with the previous vendor. The errors that used to occur in the on-premises data pipelines no longer occur on Google Cloud, which also identifies security vulnerabilities and provides fixes.

With Google Cloud, Hopi has dramatically improved its data analytics in terms of scale and performance and gained end-to-end visibility.


Hopi currently runs its microservices-based infrastructure on Google Kubernetes Engine, Compute Engine, and Cloud SQL and uses Pub/Sub to manage app events. Because the Hopi app provides personalized offers based on user location and user actions within the app, it needs to process 15 million events every day. The team states that even during a surge of users or transactions, the system keeps running without congestion.
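To put the daily volume in perspective, the short sketch below converts 15 million events per day into an average per-second rate. The `peak_factor` parameter is a hypothetical multiplier added only for illustration; the story does not state how large Hopi's traffic peaks are.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_events_per_second(events_per_day: int) -> float:
    """Average event rate, assuming events were spread evenly over the day."""
    return events_per_day / SECONDS_PER_DAY


def peak_events_per_second(events_per_day: int, peak_factor: float) -> float:
    """Rate to plan for when traffic surges.

    peak_factor is a hypothetical multiplier (peak rate / average rate);
    it is an assumption, not a figure from the Hopi story.
    """
    return average_events_per_second(events_per_day) * peak_factor


if __name__ == "__main__":
    avg = average_events_per_second(15_000_000)
    print(f"average rate: {avg:.1f} events/s")
```

Real traffic is not spread evenly, so the average only sets a lower bound on the throughput the event pipeline has to sustain.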

Because Hopi uses Google Cloud managed services for various use cases, its workload is reduced. Changes made during development are pushed to the production environment automatically through a CI/CD pipeline on Cloud Build that is triggered by each change. Events in Pub/Sub are transferred directly to BigQuery without creating a maintenance load on Hopi. With much less operational work, the Hopi data team can focus on more critical tasks. The team states that they are very satisfied with BigQuery performance.

The data Hopi collects forms the foundation for the Data Science team's work. The Hopi Data Science team builds segments from the collected data, and the marketing team uses these segments for advertising. Hopi also uses monitoring and alerting features for its jobs on Dataproc and is notified of every change to user permissions in its project. The Dataproc cluster is not up all the time; it is automated to run only during certain hours for cost optimization.
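The idea of running a cluster only during certain hours can be sketched as a simple time-window check. This is a minimal illustration, not Hopi's actual automation: the function name and the example window are hypothetical, and a real setup would call the cloud provider's scheduling or API tooling to start and stop the cluster.

```python
from datetime import time


def cluster_should_run(now: time, start: time, end: time) -> bool:
    """Return True if `now` falls inside the [start, end) run window.

    Handles windows that wrap past midnight (for example 22:00 to 06:00).
    The window values are assumptions chosen by the caller.
    """
    if start <= end:
        # Window within a single day, e.g. 09:00 to 17:00.
        return start <= now < end
    # Window wraps past midnight, e.g. 22:00 to 06:00.
    return now >= start or now < end


if __name__ == "__main__":
    # Hypothetical nightly window; the story does not give the real hours.
    window_start, window_end = time(22, 0), time(6, 0)
    for probe in (time(23, 30), time(3, 0), time(12, 0)):
        print(probe, cluster_should_run(probe, window_start, window_end))
```

A scheduler could evaluate a check like this periodically and create or delete the cluster accordingly, so that compute is billed only while jobs are expected to run.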

At Kartaca, we optimized costs with BigQuery slot reservations, buying slots in advance instead of paying on-demand to reduce the cost of nighttime jobs. By making an annual slot reservation and commitment on BigQuery, we achieved a 40% cost reduction without compromising performance.
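The percentage quoted above is the relative difference between on-demand spend and committed spend. The sketch below shows that arithmetic only; the input amounts are placeholder values in arbitrary currency units, not Hopi's actual bills or Google Cloud list prices.

```python
def cost_reduction_percent(on_demand_cost: float, committed_cost: float) -> float:
    """Relative saving of a committed plan versus on-demand, in percent."""
    if on_demand_cost <= 0:
        raise ValueError("on_demand_cost must be positive")
    return (on_demand_cost - committed_cost) / on_demand_cost * 100


if __name__ == "__main__":
    # Placeholder amounts (assumptions): any pair with the same ratio
    # corresponds to the 40% reduction reported in the story.
    saving = cost_reduction_percent(on_demand_cost=100.0, committed_cost=60.0)
    print(f"cost reduction: {saving:.0f}%")
```

A commitment pays off only when the workload keeps the reserved slots busy, which is why the steady nighttime jobs mentioned above are a natural fit.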

We updated the GKE version and optimized the instance types for price and performance, reducing resource usage in deployments and switching to instances with less CPU and memory.

GOOGLE CLOUD PRODUCTS USED BY HOPI

BigQuery, Dataflow, Dataproc, Cloud Composer (Airflow), Pub/Sub, Bigtable, Cloud SQL, GKE, Cloud Functions, Cloud Storage, Cloud Build (CI/CD pipeline), Security Command Center (SCC)

Published on: September 2022