
Kubrick enabled a major FMCG retail team to revolutionise their data access for enhanced analytics and machine learning capabilities with Databricks Unity Catalog.

Consultants

5

ETL Pipelines Migrated

65

Data Discovery time reduction

99%
01

Kubrick: Implementing Databricks Unity Catalog

The challenge

  • The retail team at a major FMCG organisation sought to enhance their analytics, starting with the access control and data sharing capabilities of their Databricks Lakehouse. To use the latest Databricks functionality (and most features moving forward), they looked to implement Databricks' Unity Catalog.
  • Ongoing issues included a 2-week wait to approve data sharing and 2 hours lost reacting to data access requests.
  • Unity Catalog was a relatively new feature, and the in-house team had neither the required knowledge nor the time around their BAU responsibilities to manage the transition themselves.
  • Moreover, the team lacked documentation of their existing processes, data landscape and downstream use cases that could provide a foundation of knowledge to support the migration.
02

The solution

  • Kubrick deployed a squad of 5 consultants to handle the creation of Unity Catalog within a new Databricks Workspace. They integrated closely with the client's in-house data engineering team to implement widespread code, linked service and cluster configuration changes to minimise disruption to the daily scheduled Azure Data Factory pipelines.
  • They collaborated with the client's cloud admins to implement fine-grained access control and appropriate cloud storage access permissions, and shared knowledge on how to interact with Unity Catalog (UC) collateral.
  • Kubrick also facilitated the necessary Unity Catalog training alongside comprehensive handover documentation, including a data dictionary, training videos, and further reference documentation for future workforce onboarding and development.
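
To illustrate what fine-grained access control in Unity Catalog can look like, here is a minimal sketch that builds the SQL grants a group needs to read a single table. It is not the client's configuration: the catalog, schema, table and group names are hypothetical, and the helper names are invented for this example. It assumes the standard Unity Catalog privilege model, in which a principal needs USE CATALOG and USE SCHEMA on the parent objects as well as SELECT on the table itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TableGrant:
    """One read grant on a Unity Catalog table (all names are hypothetical)."""
    catalog: str
    schema: str
    table: str
    principal: str
    privilege: str = "SELECT"


def quote(identifier: str) -> str:
    """Backtick-quote an identifier, doubling any embedded backticks."""
    return "`" + identifier.replace("`", "``") + "`"


def grant_statements(grant: TableGrant) -> List[str]:
    """Return the GRANT statements needed for the principal to use the table.

    The table privilege alone is not enough: the principal also needs
    USE CATALOG and USE SCHEMA on the parent catalog and schema.
    """
    principal = quote(grant.principal)
    catalog = quote(grant.catalog)
    schema = f"{catalog}.{quote(grant.schema)}"
    table = f"{schema}.{quote(grant.table)}"
    return [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO {principal}",
        f"GRANT USE SCHEMA ON SCHEMA {schema} TO {principal}",
        f"GRANT {grant.privilege} ON TABLE {table} TO {principal}",
    ]


if __name__ == "__main__":
    # Hypothetical example: give an analyst group read access to one table.
    example = TableGrant("retail_prod", "sales", "transactions", "retail-analysts")
    for statement in grant_statements(example):
        print(statement)
```

Generating the statements from a small declarative record, rather than writing them by hand, keeps grants reviewable and repeatable across environments. The resulting statements could then be run in a Databricks notebook or SQL editor by a user with sufficient privileges.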
03

The results

  • Migrated 175 Delta tables to Unity Catalog in each environment with defined access controls, delivered into the production environment.
  • Enabled access control for PII-sensitive data assets.
  • Reduced data discovery time by 99% using Delta Sharing.
  • Enabled new ML use cases using DatabricksIQ and Lakeview.
