Data Mesh – an approach founded by Zhamak Dehghani – refers to a decentralized, distributed approach to enterprise data management. It is a holistic concept that sees different datasets as distributed products, orientated around domains. The idea is that each domain-specific dataset has its own embedded engineers and product owners to manage that data and its availability to other teams, driving a level of data ownership and responsibility, which is often lacking in the current data platforms that are largely centralized, monolithic, and often built around complex pipelines.
Last Updated: December 9, 2023 | Authors: Adrian Estala, Andy Mott
Data Mesh is a strategic approach to strengthen an organization’s digital transformation journey as it centers on serving up valuable and secure data products. Data Mesh evolves beyond the traditional, monolithic, and centralized data management methods of utilizing data warehouses and data lakes.
Data Mesh improves organizational agility by giving data producers and data consumers the ability to access and manage big data without having to delegate to the data lake or data warehouse team. As a solution for data silos and data integration, Data Mesh allocates data ownership to domain-oriented groups or business units that serve, own, and manage data as a product, all of which improves data-driven decision-making for data leaders.
To understand what domain-driven data is, we must know what a domain is. A domain is an aggregation of people organized around a common functional business purpose.
Data Mesh proposes that each domain owns, and is responsible for managing, the data, metadata, and policies created by its business function. The domains are responsible for the assimilation, transformation, and provision of data to end users. Ultimately, the domain exposes its data as data products, whose entire lifecycle is owned by that domain.
Data products are produced by the domain and consumed by downstream domains or users to create business value. Data products are different from traditional data marts, as they are self-contained, and are in themselves responsible for aspects such as security, provenance and infrastructure concerns related to ensuring that the data is kept up to date. Data products enable a clear line of ownership and responsibility and can be consumed by other data products or by end consumers directly to support business intelligence and machine learning activities.
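To make the idea of a self-contained data product more concrete, here is a minimal sketch of a descriptor that carries its own ownership, provenance, access, and freshness information. The class name, field names, and example values are all hypothetical illustrations; they are not part of any Data Mesh specification or Starburst API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class DataProduct:
    """Hypothetical descriptor; every field name here is an illustrative assumption."""
    name: str
    domain: str                  # owning domain, e.g. "payments"
    owner: str                   # accountable product owner
    source_systems: list[str]    # provenance: where the data originates
    allowed_roles: set[str]      # security: roles permitted to read the product
    freshness_sla: timedelta     # how stale the data is allowed to become
    last_refreshed: datetime

    def can_read(self, role: str) -> bool:
        # The product itself answers the access question.
        return role in self.allowed_roles

    def is_fresh(self, now: datetime) -> bool:
        # The product itself knows whether it is up to date.
        return now - self.last_refreshed <= self.freshness_sla


# Example with made-up values.
now = datetime(2023, 12, 9, 12, 0, tzinfo=timezone.utc)
orders = DataProduct(
    name="orders_daily",
    domain="payments",
    owner="payments-product-owner",
    source_systems=["payments_db"],
    allowed_roles={"analyst", "data_scientist"},
    freshness_sla=timedelta(hours=24),
    last_refreshed=now - timedelta(hours=2),
)
```

Keeping these attributes on the product, rather than in a central team's documentation, is one way to express the clear line of ownership and responsibility described above.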
Related reading: Data products & data mesh: how they relate, how they are different & data product blogs
Related webinar: Empowering modern analytics strategies with data products
Related whitepaper: A guide to data products: creating and managing reusable data assets
A self-serve data infrastructure is made up of numerous capabilities that members of the domains can easily use to create and manage their data products. The self-serve data platform is supported by an infrastructure engineering team, whose primary concern is the management and operation of the various technologies in use. This illustrates the separation of concerns: domains are concerned with data, and the self-serve data platform team is concerned with technology. The measure of success of the self-serve data platform is the autonomy of the domains.
Traditional data governance and access controls can be seen as an inhibitor to producing value through data. Data Mesh enables a different approach by embedding governance concerns into the workflow of the domains. There are numerous aspects to data governance; however, when considering Data Mesh, it is imperative that usage metrics and reporting become part of this definition. How data is shared and how it is used are key data points for understanding the value, and hence the success, of individual data products.
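As a rough illustration of usage metrics becoming part of governance, the sketch below records which domains consume which data products and produces a simple report. The `UsageLog` class and the product and domain names are hypothetical; a real platform would more likely derive these figures from query logs.

```python
from collections import Counter, defaultdict


class UsageLog:
    """Hypothetical in-memory usage tracker for data products."""

    def __init__(self):
        self._queries = Counter()            # product -> number of reads
        self._consumers = defaultdict(set)   # product -> consuming domains

    def record(self, product: str, consumer_domain: str) -> None:
        self._queries[product] += 1
        self._consumers[product].add(consumer_domain)

    def report(self) -> dict:
        # One row per product: total reads and number of distinct consumers.
        return {
            product: {
                "queries": count,
                "distinct_consumers": len(self._consumers[product]),
            }
            for product, count in self._queries.items()
        }


log = UsageLog()
log.record("orders_daily", "marketing")
log.record("orders_daily", "finance")
log.record("orders_daily", "marketing")
log.record("customer_360", "marketing")
usage = log.report()
```

A product that is read often by several domains is a reasonable signal of value, while one with no consumers may be a candidate for review.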
Implementing Data Mesh promotes agility for organizations that want to thrive in an uncertain economic climate. All organizations need to be able to respond to changes in their environment with a low-cost, high-reward approach. Introducing new data sources, complying with changing regulatory requirements, and meeting new analytics requirements are all drivers that will precipitate changes to an organization’s data management activities. Current data management approaches are typically based on complex and heavily integrated data pipelines (ETL, ELT) and data ingestion between operational and analytical data systems, and in the face of these drivers they struggle to change quickly enough to support business needs. The purpose of Data Mesh is to provide a more resilient approach to data that can respond to these changes efficiently.
Related reading: 10 benefits and challenges of data mesh
Data Mesh is a ‘socio-technical’ approach that requires changes to the organization across all three dimensions of people, process and technology. Organizations that adopt Data Mesh may spend 70% of their efforts on people and processes and 30% on the technology to enable the future Data Mesh state.
Embarking on a Data Mesh journey will result in significant organizational changes and adjustments to employees’ roles. Existing workers will be critical to the success of adopting a Data Mesh, as they have invaluable tacit knowledge to contribute to the journey. Therefore, the transition of data ownership from a central data team to decentralized, domain-driven teams should also be approached as a realignment of existing data-focused employees. There are also changes to management hierarchies and reward mechanisms.
To promote a sustainable and agile data architecture, implementing Data Mesh will require process changes within the organization. If we consider data governance, new processes around data policy definition, implementation, and enforcement will be required. These will affect the process of accessing and managing data, as well as the processes for exploiting that data as part of business-as-usual (BAU) operations.
Technology capabilities are a key enabler to implement and operate a Data Mesh. New technology is likely to be required for a number of reasons:
The truth is that Data Mesh may not be the correct fit for every organization. Data Mesh is primarily aimed at larger organizations that encounter uncertainty and change in their operations and environment. If your organization is small with respect to its data needs and those data needs don’t change over time, then Data Mesh is probably an unnecessary overhead.
Learn all about strategy, implementation, and execution of Data Mesh first hand from Zhamak Dehghani.
The data lake is a technology approach whose main objective has traditionally been to provide a single repository into which data can be moved as simply as possible, with a central team responsible for managing it.
Data lakes do provide significant business value: they hold raw data in open file formats and reduce storage costs. They also suffer from a number of problems, the primary one being that once data is moved to the lake, it loses context. For example, we may have many files containing a definition of customer: one from a logistics system, one from payments, and one from marketing. Which one is correct for real-time data analysis?
Furthermore, data in the data lake will not have been pre-processed, so data issues will inevitably arise. The data consumer will then typically have to liaise with the data lake team to understand and resolve those issues, which becomes a significant bottleneck to using the data to answer the initial business question.
In comparison, Data Mesh is more than just technology. It combines technology with organizational aspects, including the ideas of data ownership, data quality, and autonomy. As a result, consumers of data have a clear line of sight to data quality and data ownership, and data issues can be discovered and resolved much more efficiently.
Ultimately data can be used and trusted.
Related reading: Data lake vs Data Mesh
The difference between data fabric and Data Mesh is that data fabric is a technological approach, whereas Data Mesh is about organization, people, and technology.
Data fabric concentrates on a collection of technological capabilities that work together to produce an interface for the end users who consume data. Many supporters of data fabric advocate automating many data management tasks through technologies like ML, so that end users can access data in a simpler way. For simple data usage there is some value in this. However, in more complex situations, or where business knowledge needs to be integrated into the data, the limitations of data fabric become apparent.
Arguably, a data fabric could be used as part of a Data Mesh self-serve platform, where the data fabric exposes data to the domains, which can then embed their business knowledge into a resulting data product.
As Daniel Abadi, Darnell-Kanal Professor of Computer Science at the University of Maryland at College Park, says, the difference between a data fabric and Data Mesh is not obvious. He advises, “Ultimately, an optimal solution will likely take the best ideas from each of these approaches.”
Related reading: Data fabric vs. data mesh: The differences
Organizations that are ready to implement Data Mesh will need help connecting their data sources for a quick win with Starburst. Below we highlight how:
As you begin your Data Mesh journey, the first step is to connect to data sources. A key Data Mesh implementation principle is to connect your enterprise data by leveraging your existing investments: lakes or warehouses; cloud or on-premises; a structured warehouse or an unstructured lake. Unlike the single-source-of-truth approach, which centralizes all your data first, you leverage and query the data where it resides. This is the first Data Mesh win for many Starburst customers, as our 40+ connectors make it possible to connect to these data sources.
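The idea of querying data where it resides can be sketched with a small stand-in. The example below uses Python's built-in sqlite3 module and attaches two separate in-memory databases, named `warehouse` and `lake` purely for illustration, then joins across them in a single query without first copying either table into a central store. This is only an analogy under those assumptions; it does not show Starburst's connectors or its actual configuration.

```python
import sqlite3

# One connection plays the role of the query engine.
conn = sqlite3.connect(":memory:")

# Two independent databases stand in for two existing data sources.
conn.execute("ATTACH DATABASE ':memory:' AS warehouse")
conn.execute("ATTACH DATABASE ':memory:' AS lake")

conn.execute("CREATE TABLE warehouse.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE lake.orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse.customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO lake.orders VALUES (?, ?)",
    [(1, 120.0), (1, 30.0), (2, 75.5)],
)

# A single query spans both sources; each table stays where it is.
rows = conn.execute(
    """
    SELECT c.name, SUM(o.amount) AS total
    FROM warehouse.customers AS c
    JOIN lake.orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

for name, total in rows:
    print(name, total)
```

The point of the sketch is the shape of the query: the consumer addresses each source by name and the engine resolves the join, rather than a pipeline first moving both tables into one repository.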
After establishing connectivity across all the various data sets, the next goal is to create an interface for business and analytics teams to find their data. In Data Mesh terms, we call that a logical domain. It’s called logical because we’re not moving data into a repository where data consumers can access it. Rather, we’re creating a logical place, a dashboard that acts as a semantic layer, where they can log in to see the data that’s been made available to them.
All the data you need resides in your domain alongside domain teams that are empowered to work autonomously. In essence, we’re promoting the concept of self-service where data consumers are empowered to independently do more on their own.
Once you have given a domain team access to the data they need, the next step is to teach them how to convert domain data into data products. Then, with data products in place, create a library or catalog of data products that you can share.
Starburst has a built-in data catalog that enables you to very quickly search, discover, and identify data products that might be of interest and improve the lives of data scientists and data engineers.
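To illustrate what searching a catalog of data products involves, here is a minimal sketch of keyword search over catalog entries. The `CatalogEntry` type, the `search` function, and the sample entries are hypothetical; this is not the API of Starburst's built-in catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record for a published data product."""
    name: str
    domain: str
    description: str
    tags: tuple[str, ...] = ()


def search(catalog: list[CatalogEntry], term: str) -> list[CatalogEntry]:
    """Return entries whose name, description, or tags contain the term."""
    needle = term.lower()
    return [
        entry
        for entry in catalog
        if needle in entry.name.lower()
        or needle in entry.description.lower()
        or any(needle in tag.lower() for tag in entry.tags)
    ]


catalog = [
    CatalogEntry("orders_daily", "payments", "Daily order totals per customer", ("orders", "finance")),
    CatalogEntry("campaign_reach", "marketing", "Reach and clicks per campaign", ("marketing",)),
    CatalogEntry("customer_360", "sales", "Unified customer profile", ("customer",)),
]

matches = search(catalog, "customer")
```

Here the search term matches one entry through its description and another through its name and tag, which is the kind of discovery a shared catalog is meant to support.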
Creating data products is a powerful capability as you’ve enabled your data consumers to very quickly move from discovery to ideation as well as to insight, because we’re quickly creating and then using data products across the organization.
Those who are eager to get started, or are just getting started, on their Data Mesh journey toward democratization and scalability will find the 90-Day Data Mesh Pathfinder helpful. In fact, many enlist a Pathfinder to help them with this ambitious endeavor. With the right strategy, it is not labor-intensive; it is a low-cost, low-risk, and high-reward exercise.
The purpose of a Pathfinder is to explore how Data Mesh will fit into your organization from a technology, people, and process perspective. You’ll also identify your strengths and weaknesses so that when you’re ready to begin your Data Mesh transformation program, we can draw on the learnings from the Pathfinder to accelerate in the areas where you can move quickly, and slow down in the areas where you need remedial work.
Related reading: The Data Mesh Pathfinder eBook
Related workshop: 2 hour Data Mesh Pathfinder workshop
Data Mesh TV is a monthly educational program for data leaders by data leaders about data monetization, optimizing data products, and accelerating digital transformation initiatives with Data Mesh.
“Data Mesh is certainly the future for our business, and probably for many others, particularly ones which have a legacy of acquisitions, and the need for merging of different data sets to form a new larger entity. Having the ability to query data where it resides using Starburst is enormously powerful and makes a huge impact on the ability for data to provide answers.”
“Decentralized access is definitely the future…We are currently in the process of creating data products, which Starburst is really helping with. Previously, without a single point of secure data access, creating a data product was not possible. With the abstraction layer that Starburst provides across different data sources, it has become our analytics engine for the Data Mesh.”
In this video, Adrian Estala, VP, Field CDO, gives an overview of how to seamlessly integrate Data Mesh into your existing ecosystem.
In this video, Adrian Estala, VP, Field CDO, shares three benefits of implementing a Data Mesh as a digital transformation strategy.
In this webinar, Adrian Estala shares how to create your Data Mesh vision by engaging business stakeholders early in the process.
In this video, Starburst’s Adrian Estala and DAMA Netherlands President Ronald Baan cover data governance elements that many data leaders struggle with.
Starburst’s Lead Solutions Architect Andy Mott explains why Data Mesh isn’t something you can buy off the shelf and what it can do for your organization.
Are you Data Mesh ready? In this session, Andy Mott, Solutions Architect at Starburst explains everything Data Mesh.
Accenture’s Cloud First Chief Technologist, Teresa Tung shares how the Data Mesh paradigm can enable data access by unlocking the value of distributed data.
Hear Valli Musti, Engineering Leader at Priceline, share their story of innovation and their data and analytics journey with Starburst with Data Mesh as their core architecture.
The question of which one to use today (data mesh or data fabric) and whether there is even a question of one versus the other in the first place is not obvious. Ultimately, an optimal solution will likely take the best ideas from each of these approaches.
— Daniel Abadi, Darnell-Kanal Professor of Computer Science, University of Maryland at College Park
With the Data Mesh approach to data management, retailers can more rapidly deploy data strategies that help them better understand their customers and make valuable business decisions.
— Andy Mott MBA, Head of Partner Solutions Architecture and Data Mesh Lead
Data Mesh architecture closes the gap between these transactions and the process of analysis with data ownership granted to individual teams, allowing them to make quick, real-time decisions without the need for data transfer.
— Andy Mott MBA, Head of Partner Solutions Architecture and Data Mesh Lead
Data Mesh, with its deep understanding of technical necessities of data management and breaking down organisational barriers, will and should become the approach of choice if businesses want to strive to become data-driven in their decisions.
— Jess Iandiorio, CMO
The Data Mesh offers a framework for companies to democratise both data access and data management by treating data as a product, curated and governed by the domain experts themselves.
— Justin Borgman, CEO
It’s an acknowledgment that data will be decentralized and that there are advantages to being decentralized, and that really what we’re trying to produce is a single point of access or single point of analytics across all that data regardless of where it lives.
— Justin Borgman, CEO
Treating data as a first-class product drives domain owners to deliver high value and high-quality data for analysis by a wide range of consumers across the organization. I’m proud of the team for delivering what I believe is the first solution of its kind.
— Justin Borgman, CEO
This shift [to decentralized] is business-driven, not IT-driven. This demonstrates the urgency to deliver digital transformation. IT has realized that we can’t migrate to — or sustain — a centralized architecture with the efficiency that the business demands.
— Adrian Estala, VP of Data Mesh Consulting Services
This combination of decentralized data ownership and treating data as a product as part of a Data Mesh approach removes the bottlenecks that come with the traditional data warehousing and data lake models, and in doing so, allows companies to drive faster insights.
— Andy Mott, Partner Solutions Architect
Enterprises have found moving large amounts of data to be cumbersome, expensive and time consuming delaying major business decisions. The Data Mesh concept not only solves this problem but also is a step towards ensuring that data is treated as a first-class product as it is no longer a by-product of an organisation’s operations.
— Collen Tartow, Director of Engineering
Companies today realize that it’s a fool’s errand to try to consolidate all of your data into a single data store, and that sentiment is driving the shift to a Data Mesh architecture. Starburst aims to be the de facto query engine for the Data Mesh paradigm.
— Justin Borgman, CEO
Data Mesh is not necessarily about a specific type of technology or code that magically solves data problems at the touch of a button. Instead, it’s about the human side of technology and getting teams to be able to work independently to maximise the value out of data within that organization.
— Justin Borgman, CEO
Achieving a successful Data Mesh architecture requires the ability to access data in disparate systems and sources.
— Matt Fuller, VP of Product
A Data Mesh approach can help financial services better serve its customers and showcase how innovation and success is enabled via a data-driven strategy. Data Mesh decentralises data management and diminishes the impacts of silos and bottlenecks by giving teams ownership, control, and access to their own data.
— Andy Mott, Partner Solutions Architect
With Starburst, TSYS is working towards achieving a sound Data Mesh infrastructure to help their business scale and unlock more data-driven insights.
— Justin Borgman, CEO, Starburst and Mahesh Lagishetty, VP Data Engineering
The data fabric fundamentally is about eliminating human effort, while the data mesh is about smarter and more efficient use of human effort. Of course, it would initially seem that eliminating human effort is always better than repurposing it. However, despite the incredible recent advances we’ve made in ML, we are still not at the point today where we can fully trust machines to perform these key data management and integration activities that are today performed by humans.
— Daniel Abadi, Darnell-Kanal Professor of Computer Science, University of Maryland at College Park
By giving the experts greater control over the data from the beginning of the data management process, businesses will be less likely to lose key data and will be able to bypass common bottlenecks that occur in a centralized approach. The agility of this approach is beneficial for the overall business and will allow for more time to be spent on the analysis, rather than data transfers or depending on the constraint imposed by a centralized IT function.
— Andy Mott, Partner Solutions Architect
As the amount of online data increases and therefore the ability to generate more comprehensive customer insights, it is key to select a data strategy that is focused on removing the impact of inefficient silos and focused on decentralised data for maximum operability and efficiency. This is where data virtualisation can make a significant positive impact, and even more so when it’s part of the adoption of a data mesh based approach.
— Andy Mott, Partner Solutions Architect
— Dan Cook, Senior Product Director of Data Platforms and Analytics, doxo
— Richard Jarvis, CTO, EMIS Group
© Starburst Data, Inc. Starburst and Starburst Data are registered trademarks of Starburst Data, Inc. All rights reserved. Presto®, the Presto logo, Delta Lake, and the Delta Lake logo are trademarks of LF Projects, LLC