APPROACH

The approach to address the client’s challenge included:

  • Building reusable data pipelines to migrate from the existing Hadoop-based system to GCP
  • Leveraging the GCP environment to build a cost-optimized ecosystem serving both business users and the data science community
  • Merging demographics, customer, geo-location, credit-model, clickstream, and marketing campaign data into a single source of truth
  • Using model management on GCP to track and maintain 150+ ML models
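The consolidation step above can be pictured as a set of joins keyed on a shared customer identifier. The sketch below uses Python's built-in sqlite3 as a small stand-in for BigQuery SQL; the table names, column names, and sample rows are hypothetical, and in the actual solution this join would run over the full demographic, clickstream, and campaign feeds.

```python
import sqlite3

# Illustrative stand-in for building the "single source of truth":
# joining customer, demographics, and clickstream records on customer_id.
# All table/column names and data here are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE demographics (customer_id INTEGER, age INTEGER, region TEXT);
CREATE TABLE clickstream (customer_id INTEGER, page_views INTEGER);
INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO demographics VALUES (1, 34, 'West'), (2, 41, 'East');
INSERT INTO clickstream VALUES (1, 120), (2, 87);
""")

# LEFT JOINs keep every customer even when one feed is missing a record.
cur.execute("""
CREATE TABLE single_source_of_truth AS
SELECT c.customer_id, c.name, d.age, d.region, s.page_views
FROM customers c
LEFT JOIN demographics d ON d.customer_id = c.customer_id
LEFT JOIN clickstream s ON s.customer_id = c.customer_id
""")

rows = cur.execute(
    "SELECT * FROM single_source_of_truth ORDER BY customer_id"
).fetchall()
print(rows)  # [(1, 'Alice', 34, 'West', 120), (2, 'Bob', 41, 'East', 87)]
```

Running the same join pattern in BigQuery lets downstream users query one curated table instead of six separate feeds.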

KEY BENEFITS

  • Auto-scaling infrastructure on GCP to manage variable workloads, reducing cost
  • BigQuery and Dataproc used in tandem, allocating compute horsepower on a case-by-case basis according to cost
  • Leveraging Kubeflow and Kubernetes for model management and for deploying model endpoints for downstream consumption
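Managing 150+ models comes down to bookkeeping: version histories per model and a pointer to the version currently serving traffic. The following is a conceptual sketch of that bookkeeping, not the Kubeflow API; the class names, the `churn-model` name, and the `gs://` artifact URIs are all hypothetical.

```python
from dataclasses import dataclass, field

# Conceptual sketch (NOT the Kubeflow API) of the bookkeeping behind
# managing 150+ models: each model keeps a version history plus a pointer
# to the version currently serving traffic.
@dataclass
class ModelEntry:
    name: str
    versions: list = field(default_factory=list)  # e.g. artifact URIs
    serving_version: int = -1                     # index into versions

class ModelRegistry:
    def __init__(self):
        self._models = {}

    def register(self, name, artifact_uri):
        """Record a new model version; returns its version number."""
        entry = self._models.setdefault(name, ModelEntry(name))
        entry.versions.append(artifact_uri)
        return len(entry.versions) - 1

    def promote(self, name, version):
        # Point the serving endpoint at a specific version (e.g. after
        # offline validation); rollback is just promoting an older one.
        self._models[name].serving_version = version

    def serving_uri(self, name):
        entry = self._models[name]
        return entry.versions[entry.serving_version]

registry = ModelRegistry()
registry.register("churn-model", "gs://bucket/churn/v0")  # hypothetical URI
v1 = registry.register("churn-model", "gs://bucket/churn/v1")
registry.promote("churn-model", v1)
print(registry.serving_uri("churn-model"))  # gs://bucket/churn/v1
```

In the deployed system, Kubeflow plays this registry role and Kubernetes hosts the promoted version behind an endpoint for downstream consumers.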

RESULTS

  • Since its inception, the UDP has processed 250 TB of data weekly
  • Overall turnaround time reduced by 70 percent for computationally heavy jobs
  • 30-35 percent overall cost savings
