Data Mesh: Architecture, Use Cases, and Implementation via Data Fabric

Yuval Perlov

CTO, K2View

Data mesh is a novel approach to organizing and delivering data that is gaining ground among innovative enterprise data architects. Based on the concepts of the “business entity” and “data as a product”, an entity-based data fabric is the optimal implementation for the data mesh design pattern. This article explores the data mesh architecture, and how it’s delivered via an entity-based data fabric.

Table of Contents


Data Mesh Objectives
Data Mesh Philosophy
Implementing the Data Mesh with an Entity-Based Data Fabric
Entity-Based Data Fabric Use Cases
K2View Data Fabric: Data Mesh Inside

Data Mesh Objectives

Data mesh was designed to address 4 key issues in managing big data:

  • Data fragmented between dozens, and sometimes hundreds, of legacy and modern systems, making a single source of truth an impossibility

  • The volume and velocity of data that data-driven enterprises must contend with

  • Lack of data democratization, where access to trusted data usually requires data engineering

  • Disconnect between data engineers, data scientists, business analysts, and operational data consumers

The simple premise of the data mesh is that business departments should be able to access and control their own data and analytics.

The thinking is that business stakeholders in a specific domain understand their data needs better than anybody else. The justification for data mesh is that when businesspeople are forced to work with data engineers or data scientists outside their domain, provisioning the right data to the right consumers at the right time becomes time-consuming, often inaccurate, and ultimately ineffective.

Data Mesh Philosophy

The data mesh concept outlines 4 principles for organizing data:

  • Data as a product, where data products – composed of clean, fresh, analytics-ready data – are delivered to any consumer, anytime, anywhere, based on permissions and roles

  • Domain-driven data ownership, which reduces the reliance on centralized data teams (data engineers and data scientists)

  • Easy access to governed and trusted data, enabled by new levels of abstraction and automation – designed to share the needed data multi-dimensionally, free of friction, and on demand (at the mesh level)

  • Distributed governance, where each domain governs its own data products, but is reliant on central control of data modeling, security policies, and compliance

In the data mesh approach, every business domain controls all aspects of its data products for analytical and operational use cases – in terms of cleanliness, freshness, privacy compliance, etc. – and for sharing them with other domains (departments across the enterprise).
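As an illustration of the “data as a product” principle, here is a minimal, hypothetical sketch in Python. The `DataProduct` class and its fields are invented for this example – they show how a product bundles data with a domain owner, freshness information, and a role-based access policy, and do not represent any specific vendor’s API:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable

# Hypothetical sketch: a "data product" bundles the data itself with the
# product metadata the mesh principles call for -- an owning domain,
# freshness information, and a role-based access policy.
@dataclass
class DataProduct:
    name: str                        # e.g. "customer-360"
    owner_domain: str                # the business domain that owns the product
    allowed_roles: set[str]          # roles permitted to consume the product
    refreshed_at: datetime           # freshness, published as part of the contract
    fetch: Callable[[], list[dict]]  # the data, exposed behind an interface

    def read(self, consumer_role: str) -> list[dict]:
        """Serve the product only to permitted roles (distributed governance)."""
        if consumer_role not in self.allowed_roles:
            raise PermissionError(f"role '{consumer_role}' may not read '{self.name}'")
        return self.fetch()

product = DataProduct(
    name="customer-360",
    owner_domain="customer-care",
    allowed_roles={"analyst", "care-agent"},
    refreshed_at=datetime.now(timezone.utc),
    fetch=lambda: [{"customer_id": 42, "open_tickets": 1}],
)
print(product.read("analyst"))  # a permitted role receives the data
```

The key design point is that consumers never see raw tables – they see a named, owned, governed product.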

Implementing the Data Mesh with an Entity-Based Data Fabric

As discussed above, a key pillar of data mesh is data as a product.

A data fabric creates an integrated layer of connected data across disparate data sources to deliver a real-time and holistic view of the business to operational and analytical workloads.

An entity-based data fabric centralizes the semantic definition of the various data products that are important to the business. It also sets up the data ingestion methods, and the needed central governance policies, that protect and secure the data, in the data products, in accordance with regulations.

Additional data fabric nodes are deployed in alignment with the business domains, providing the domains with local control of data services and pipelines to access and govern the data products for their respective data consumers.
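The centralized semantic definition described above can be sketched as follows. Everything here – the `Customer` schema, the `CRM` and `ORDERS` stand-in sources, and the `assemble_customer` helper – is a hypothetical illustration of the pattern, not a K2View construct:

```python
# Hypothetical sketch: the fabric holds one central semantic definition of a
# business entity ("Customer") and assembles each instance from the disparate
# systems that hold fragments of it. The source connectors are stand-ins.
CUSTOMER_SCHEMA = ("customer_id", "name", "open_orders")

# Stand-in source systems (in reality: CRM, billing, ordering, etc.)
CRM = {42: {"name": "Ada Lovelace"}}
ORDERS = {42: {"open_orders": 3}}

def assemble_customer(customer_id: int) -> dict:
    """Build one Customer entity instance from all underlying sources,
    validating it against the central schema before serving it."""
    entity = {"customer_id": customer_id}
    entity.update(CRM.get(customer_id, {}))
    entity.update(ORDERS.get(customer_id, {}))
    missing = [f for f in CUSTOMER_SCHEMA if f not in entity]
    if missing:
        raise ValueError(f"incomplete entity, missing fields: {missing}")
    return entity

print(assemble_customer(42))
# {'customer_id': 42, 'name': 'Ada Lovelace', 'open_orders': 3}
```

Because the schema and validation live centrally, domain nodes can expose the assembled entity without re-implementing integration logic per source system.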

 

[Image: data mesh architecture and the data fabric] Here’s what a data mesh implementation looks like with an entity-based data fabric.

 

In this sense, an entity-based data fabric – that manages, prepares, and delivers data in the form of business entities – becomes the data mesh core.

While data mesh architecture design introduces technology and implementation challenges, these are neatly addressed via an entity-based data fabric:

Data mesh implementation challenges – and how an entity-based data fabric addresses them:

  • Need for data integration expertise: Domain-specific data pipelining requires distributed expertise in complex data integration and modeling of multiple disparate source systems across the enterprise. Addressed by data products as business entities: when a data product is a business entity managed in a virtual data layer, domains don’t have to deal with the underlying source systems.

  • Independence vs confederacy: Striking the right balance between domain independence and reliance on central data teams isn’t trivial. Addressed by cross-functional collaboration: centralized data teams collaborate with domain-specific teams to produce the data products, while the domain-specific teams create APIs and pipelines for their respective data consumers, govern and control access rights, and monitor usage.

  • Real-time and batch data delivery: Trusted data products need to be provisioned to both online and offline data consumers, efficiently and securely, on a single platform. Addressed by supporting operational and analytical workloads: an entity-based data fabric ingests and processes data from underlying systems to deliver data products on demand, for operational and analytical use cases.
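The real-time and batch delivery challenge can be sketched as one entity store serving both paths. The names here (`ENTITIES`, `get_entity`, `export_batch`) are hypothetical, chosen only to illustrate the dual-workload idea:

```python
# Hypothetical sketch: the same data product serves an operational consumer
# (one entity, by key, on demand) and an analytical consumer (the full set,
# in batch) from a single platform.
ENTITIES = {
    1: {"customer_id": 1, "churn_risk": 0.2},
    2: {"customer_id": 2, "churn_risk": 0.7},
}

def get_entity(customer_id: int) -> dict:
    """Operational path: one entity, by key, in real time."""
    return ENTITIES[customer_id]

def export_batch() -> list[dict]:
    """Analytical path: the whole product, for offline workloads."""
    return list(ENTITIES.values())

print(get_entity(2)["churn_risk"])  # 0.7
print(len(export_batch()))          # 2
```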

Entity-Based Data Fabric Use Cases

An entity-based data fabric supports multiple operational and analytical use cases across multiple domains in the enterprise. Here are a few examples:

  • Customer 360 view, to support customer care in reducing average handle time, increasing first contact resolution, and improving customer satisfaction

  • Hyper segmentation, for marketing teams to deliver the right campaign to the right customer, at the right time, and via the right channel

  • Data privacy management, to protect customer data according to data privacy regulations – such as GDPR, CCPA, and LGPD – prior to making it available to data consumers in the business domains

  • IoT device monitoring, providing product teams with insights into edge device usage patterns, to continually improve product adoption and profitability

  • Federated data preparation, enabling domains to quickly provision quality, trusted data for their data analytics workloads
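As a rough illustration of the data privacy use case above, the sketch below masks personally identifiable fields with a deterministic hash before an entity is shared with a domain. The `PII_FIELDS` list and `mask` helper are assumptions made for this example, not an actual product feature:

```python
import hashlib

# Hypothetical sketch: PII fields are masked centrally before an entity
# leaves the fabric, so domains receive compliant data products.
PII_FIELDS = {"email", "phone"}

def mask(entity: dict) -> dict:
    """Replace PII values with a truncated deterministic hash, so records
    remain joinable across products while no raw personal data is exposed."""
    return {
        k: hashlib.sha256(str(v).encode()).hexdigest()[:12] if k in PII_FIELDS else v
        for k, v in entity.items()
    }

customer = {"customer_id": 42, "email": "ada@example.com", "segment": "gold"}
masked = mask(customer)
print(masked["segment"])  # non-PII fields pass through unchanged: gold
```

Deterministic hashing is one common masking choice; tokenization or format-preserving encryption are alternatives when reversibility or realistic test data is required.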

K2View Data Fabric: Data Mesh Inside

The K2View data fabric is ideally suited for implementing data mesh architecture because it:

  • Integrates data, from all sources, into any number of data products, for secure distribution among any number of domains

  • Provides centralized data modeling, governance, and cataloging, while enabling self-service access to the data by domains – for both analytical and operational workloads

  • Creates a federated alliance between all company domains by providing a single, trusted, and holistic view of all business entities, for all domains

  • Epitomizes infrastructure as a platform, having paved the way for multi-dimensional data abstraction and automation, at some of the world’s largest enterprises

Get all the benefits of data mesh with K2View Data Fabric