Data mesh, a decentralized data management architecture, relies on four principles: data products, domain ownership, instant access, and federated governance.
The thinking is that business stakeholders in a specific domain understand their data needs better than anybody else. When business people are forced to work with data engineers or data scientists outside their domain, provisioning the right data to the right data consumers at the right time becomes time-consuming, error-prone, and ultimately ineffective.
In a data mesh, every business domain retains control over all aspects of its data products for both analytical and operational use cases – in terms of quality, freshness, privacy compliance, etc. – and is responsible for sharing them with other domains (departments in the enterprise).
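One way to picture this ownership is as a small contract that a domain publishes with each data product. The sketch below is only an illustration: the `DataProductContract` class, its fields, and the `meets_contract` helper are hypothetical names, not part of any particular platform.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DataProductContract:
    """Hypothetical descriptor a domain publishes alongside its data product."""
    name: str
    owner_domain: str
    max_staleness: timedelta   # freshness promise made to consumers
    min_completeness: float    # required share of populated fields, 0..1
    pii_fields: frozenset      # fields subject to privacy rules


def meets_contract(contract, last_refreshed, completeness, now=None):
    """Return a list of violations; an empty list means the contract is met."""
    now = now or datetime.now(timezone.utc)
    violations = []
    if now - last_refreshed > contract.max_staleness:
        violations.append("stale")
    if completeness < contract.min_completeness:
        violations.append("incomplete")
    return violations
```

The owning domain would run a check like this before sharing the product, so that consumers in other domains can rely on the stated freshness and quality.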
Data products are produced to be consumed with a specific purpose in mind. A data product may assume a variety of forms, based on the specific business domain or use case to be addressed.
A data product will often correspond to a dataset of one or more business entities – such as customer, asset, supplier, order, credit card, campaign, etc. – that data consumers would like to access for analytical and operational workloads. This data will typically originate in dozens of siloed source systems, often with different technologies, structures, formats, and terminologies.
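As a rough sketch of what organizing data by business entity can mean, the example below merges rows from two siloed systems into one record per customer. The source names, field names, and the `build_customer_product` function are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical extracts from two siloed systems that name fields differently.
CRM_ROWS = [
    {"cust_id": "C1", "full_name": "Ada Lovelace", "email": "ada@example.com"},
    {"cust_id": "C2", "full_name": "Alan Turing", "email": "alan@example.com"},
]
BILLING_ROWS = [
    {"customerRef": "C1", "invoice": "INV-10", "amount_due": 120.0},
    {"customerRef": "C1", "invoice": "INV-11", "amount_due": 30.5},
]


def build_customer_product(crm_rows, billing_rows):
    """Group source rows by customer ID into one record per business entity."""
    invoices = defaultdict(list)
    for row in billing_rows:
        invoices[row["customerRef"]].append(
            {"invoice": row["invoice"], "amount_due": row["amount_due"]}
        )
    product = {}
    for row in crm_rows:
        cid = row["cust_id"]
        product[cid] = {
            "customer_id": cid,
            "name": row["full_name"],
            "email": row["email"],
            "invoices": invoices.get(cid, []),  # empty list if no billing data
        }
    return product


customers = build_customer_product(CRM_ROWS, BILLING_ROWS)
```

Each value in `customers` is a single customer entity with a unified vocabulary, regardless of how the source systems named their fields.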
The data product delivery lifecycle adheres to the agile principles of being short and iterative, to deliver quick, incremental value to data consumers. A data product approach entails:
Decentralization
With the meteoric rise of cloud-based applications, application architectures are transitioning away from centralized IT, toward distributed data services (a service mesh).
Data architecture is following the same trend, with data being distributed across a wide range of physical sites, spanning many locations (a data mesh). Although a monolithic, centralized data architecture is often simpler to create and maintain, in an IT world that is moving rapidly to the cloud, there are many good reasons to adopt a modular, decentralized data management system.
Data mesh represents a decentralized way of distributing data across virtual and physical networks. Where legacy data integration tools require a highly centralized infrastructure, a data mesh operates across on-premise, single-cloud, multi-cloud and edge environments.
Distributed security
When data is highly distributed and decentralized, security plays a critical role. Distributed systems must delegate authentication and authorization activities out to a host of different users, with different levels of access. Key data mesh security capabilities include:
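A minimal sketch of delegated authorization, in which each domain keeps its own grants rather than relying on one central service, might look like the following. The access levels and the `DomainAuthorizer` class are hypothetical and chosen only to illustrate the idea of different users holding different levels of access.

```python
from dataclasses import dataclass

# Hypothetical access levels, ordered from least to most privileged.
LEVELS = {"none": 0, "masked": 1, "full": 2}


@dataclass(frozen=True)
class Grant:
    user: str
    data_product: str
    level: str


class DomainAuthorizer:
    """Holds the grants for one domain; every domain has its own instance."""

    def __init__(self, domain):
        self.domain = domain
        self._grants = {}

    def grant(self, user, data_product, level):
        if level not in LEVELS:
            raise ValueError(f"unknown access level: {level}")
        self._grants[(user, data_product)] = Grant(user, data_product, level)

    def access_level(self, user, data_product):
        found = self._grants.get((user, data_product))
        return found.level if found else "none"

    def can_read(self, user, data_product, required="masked"):
        """True if the user's level is at least the required level."""
        return LEVELS[self.access_level(user, data_product)] >= LEVELS[required]
```

Users without an explicit grant fall back to `"none"`, so access is denied by default.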
Data product mindset
Innovative data product practices combine the concepts of "design thinking", for breaking down the organizational silos that often impede cross-functional innovation, and the "jobs to be done" theory, which defines the product’s ultimate purpose in fulfilling specific data consumer goals.
Exchange data products between data producers and data consumers
Simplify the way data is processed, organized, and governed
Democratize data with a self-service approach that minimizes dependence on IT
| Traditional data management platforms | Data mesh architectures |
| --- | --- |
| Serve a centralized data team that supports multiple domains | Serve autonomous domain teams |
| Manage code, data, and policies, as a single unit | Manage code and pipelines independently |
| Require separate stacks for operational and analytical workloads | Provide a single platform for operational and analytic workloads |
| Cater to IT, with little regard for Business | Cater to IT and Business, alike |
| Centralize the platform for optimized control | Decentralize the platform for optimized scale |
| Force domain awareness | Remain domain-agnostic |
The left-hand side of the table describes most monolithic data platforms. They serve a centralized IT team and are optimized for control. The operational stacks used to run enterprise software are completely separated from the clusters managing the analytical data.
The data mesh dictates greater autonomy in the management of data flows, data pipelines, and policies. At the end of the day, data mesh is an architecture based on decentralized thinking that can be applied to any domain.
Here are the key considerations:
Data mesh supports many different operational and analytical use cases, across multiple domains. Here are a few examples:
A data product platform creates and delivers data products of connected data from disparate sources to provide a real-time and holistic view of the business to operational and analytical workloads.
A real-time data product platform creates the semantic definition of the various data products that are important to the business. It also sets up the data ingestion methods, and the needed central governance policies, that protect and secure the data in the data products, in accordance with regulations.
Additional platform nodes are deployed in alignment with the business domains, providing the domains with local control of data services and pipelines to access and govern the data products for their respective data consumers.
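The split between central governance and local domain control can be sketched as follows. This is an assumption-laden illustration, not a description of any product's API: `CENTRAL_POLICY`, `apply_policy`, and `DomainNode` are hypothetical names, and hashing or dropping fields stands in for whatever protection a real policy would require.

```python
import hashlib

# Hypothetical central policy: how each sensitive field is treated everywhere.
CENTRAL_POLICY = {"email": "hash", "ssn": "drop"}


def apply_policy(record, policy):
    """Return a copy of the record with the governance policy enforced."""
    out = {}
    for field, value in record.items():
        rule = policy.get(field, "keep")
        if rule == "drop":
            continue  # field never leaves the platform
        if rule == "hash":
            value = hashlib.sha256(str(value).encode("utf-8")).hexdigest()[:12]
        out[field] = value
    return out


class DomainNode:
    """A domain-level node: local choice of fields, central policy always applied."""

    def __init__(self, domain, exposed_fields, policy=CENTRAL_POLICY):
        self.domain = domain
        self.exposed_fields = set(exposed_fields)
        self.policy = policy

    def serve(self, record):
        # The domain decides which fields its consumers see...
        local_view = {k: v for k, v in record.items() if k in self.exposed_fields}
        # ...but cannot bypass the centrally defined protection rules.
        return apply_policy(local_view, self.policy)
```

In this sketch a marketing node and a support node could expose different fields of the same customer record, while both remain bound by the same central rules for `email` and `ssn`.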
Here’s what a data mesh implementation looks like based on a real-time data product platform.
In this sense, a data product platform – that manages, prepares, and delivers data in the form of business entities – becomes the data mesh core.
While data mesh architecture introduces technology and implementation challenges, these are neatly addressed with a data product platform:
| Data mesh implementation challenges | How they are addressed by a data product platform |
| --- | --- |
| Need for data integration expertise | Data products as business entities |
| Independence vs confederacy | Cross-functional collaboration |
| Real-time and batch data delivery | Operational and analytical workloads |
The K2view Data Product Platform is ideally suited for implementing data mesh architecture because it:
Integrates data, from all sources, into any number of data products, for secure distribution among any number of domains
Provides centralized data modeling, governance, and cataloging, while enabling self-service access to the data by domains – for both analytical and operational workloads
Creates a federated alliance between all company domains by providing a single, trusted, and holistic view of all business entities, for all domains
Epitomizes infrastructure as a platform, having paved the way for multi-dimensional data abstraction and automation, at some of the world’s largest enterprises