In this episode of Data Mesh TV, I met with Nachiket Mehta, Head of Data & Analytics Engineering, Supply Chain at Wayfair. Nachiket is an experienced and recognized leader in the data mesh community; his presentations lean on real experience, and he offers innovative ideas.
For episode 16, I dissected how a mesh approach impacts different personas across the data supply chain. Nachiket provided incredible content that is still generating positive feedback long after our live recording. You can watch the episode here, and we highlight some of the main points below.
Which persona leads the design?
I prefer to start with the data consumer: I want to understand their unique line-of-business data needs and challenges. I also want to establish a clear view of the business problem that I am working to solve.
Nachiket takes a more balanced approach; he argued that all of the personas should have equal footing in the early design phase. Obtaining full enterprise commitment requires support at all layers, and this early support is key to navigating the change and pain that is part of every transformational program. Bottom line: we should focus on all the personas to build strong, early alignment.
We broke down the primary personas into three categories and discussed how the mesh might be viewed from each lens.
For data producers, a mesh approach provides much better transparency into how their data is being used, from the source to the solution. That visibility drives new behaviors: it motivates producers to improve quality and metadata, and it helps them focus their efforts on the data sets that are actually being used.
Equally, it creates new partnerships or contracts in the governance of that data. As data is integrated into curated data products, you have people closer to the business helping to improve the metadata using a language that is much easier for the consumer to understand.
Data consumers have become a very diverse persona group. There are citizen consumers, data science consumers, external consumers and even… yes, AI consumers. As you build out your mesh design, you need to understand the unique needs of each of your consumer groups.
One of the greatest attributes of a mesh design is the incredible flexibility it offers: you can build domains and data products that are fit for purpose, with the right level of governance, the right size and the required interoperability.
The data engineer persona created a really lively discussion; it always does, and we can thank Joe Reis for that. Is this ‘Data Engineers’, ‘Data Product Engineers’, ‘Data Infrastructure Engineers’ or none of the above? We both agreed that this part of the data supply chain is overwhelmed and that a mesh approach can, ultimately, provide some great relief. The vision of a self-service platform that enables autonomous domain teams to work faster and on their own takes time to develop and mature.
Engineers are often assigned to domain teams, or they are asked to work across domain teams as those teams get up to speed. Some teams adopt this new way of working rapidly; others may not be ready to take the keys and drive. Engineers will continue to be overwhelmed until these domain teams are ready to pick up the required skills and take on the self-service capabilities. There is a lot more to discuss here on topics like data contracts, engineering specialties and where architecture fits in.
Nachiket gave an incredible overview of how he developed a data ontology for his supply chain; this was my favorite part of the episode. Wayfair established ontological concepts for all of their domains. This drives domain-driven ownership, ensuring that the experts are defining and sharing the information. These responsibilities address the technical and the ethical concerns that define how data is managed in each domain. Data producers, engineers and consumers are all part of their domains.
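To make the idea concrete, here is a minimal sketch of how ontological concepts and domain-driven ownership might be captured as a data structure. The domain name, concepts, team names and roles below are invented for illustration; they are not Wayfair's actual ontology.

```python
# Hypothetical sketch: domain ontology concepts with clear ownership.
# All names here are illustrative assumptions, not a real ontology.
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    name: str
    definition: str
    steward: str  # the domain expert accountable for defining this concept


@dataclass
class Domain:
    name: str
    concepts: list
    producers: list   # teams/systems that publish the domain's data
    engineers: list   # teams that build and run the domain's data products
    consumers: list   # teams that use the domain's data products


fulfillment = Domain(
    name="fulfillment",
    concepts=[
        Concept("shipment", "A physical movement of goods to a customer", "ops-team"),
        Concept("carrier", "A third party that transports shipments", "logistics-team"),
    ],
    producers=["wms-ingest"],
    engineers=["fulfillment-data-eng"],
    consumers=["delivery-analytics"],
)

# Ownership lookup: which domain expert defines a given concept?
stewards = {c.name: c.steward for c in fulfillment.concepts}
print(stewards["carrier"])  # logistics-team
```

The point of the structure is that producers, engineers and consumers all live inside the domain, and every shared concept has a named steward, so definitions are owned by experts rather than by a central team.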
When building domains, it is important to promote best practices. You want to share winning ways of working across all of your domain teams. When teams don’t follow the best practices, it creates challenges that strain the shared elements of your mesh platform.
We want data to be part of an enterprise shared economy, where openness and transparency drive faster innovation and customer satisfaction. We want to share everything, but there must still be boundaries, and those boundaries are no longer clear.
Data privacy, sovereignty and data ethics are clear boundaries that are established and embedded into the control framework. However, what’s not as clear are the boundaries between lines of business or between different companies that govern how data is shared. We want to share data, and data products offer a manageable vehicle that can be secured and easily consumed. We are just scratching the surface on how these data products will open new doors for sharing across the business ecosystem.
Data contracts deserve their own episode, and we did our best to skim the surface, but data contracts are such an interesting area. Nachiket talked about the importance of monitoring and enforcing data contract agreements. What happens when a contract is violated? How do you assess the impact, and who is responsible for the response?
As a CDO, I wrote data contracts for all of my data owners and even had executives co-sign the agreements. They created awareness of accountability, but we never established a process for enforcement. They were not effective. Nachiket is right: we need to establish contract management.
Data producers should own the data contracts and are accountable for enforcing the policies, but they need effective tools that can help automate the monitoring and reporting. I don't think that data contracts should be consistent across the organization; those contracts will vary based on the risk profile of the associated data sets.
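As a sketch of what automated contract monitoring could look like, the snippet below checks a batch of records against a simple contract (required fields, expected types, a tolerated null ratio) and reports violations. The contract fields, product name and thresholds are assumptions for illustration; real contract tooling would cover far more (freshness, semantics, versioning).

```python
# Minimal, hypothetical sketch of automated data-contract checking.
# The contract shape and limits here are illustrative, not a standard.
from dataclasses import dataclass


@dataclass
class DataContract:
    product: str
    required_fields: dict      # field name -> expected Python type
    max_null_ratio: float = 0.0  # tolerated fraction of null values per field

    def check(self, records):
        """Return a list of human-readable contract violations."""
        violations = []
        for name, expected_type in self.required_fields.items():
            values = [r.get(name) for r in records]
            missing = sum(v is None for v in values)
            if records and missing / len(records) > self.max_null_ratio:
                violations.append(
                    f"{name}: null ratio {missing / len(records):.0%} exceeds limit"
                )
            for v in values:
                if v is not None and not isinstance(v, expected_type):
                    violations.append(
                        f"{name}: expected {expected_type.__name__}, got {type(v).__name__}"
                    )
                    break  # one type violation per field is enough to flag
        return violations


# Hypothetical supply-chain data product and sample batches.
contract = DataContract(
    product="supply_chain.shipments",
    required_fields={"shipment_id": str, "weight_kg": float},
    max_null_ratio=0.1,
)

good = [{"shipment_id": "S1", "weight_kg": 12.5}]
bad = [{"shipment_id": "S2", "weight_kg": "heavy"},
       {"shipment_id": None, "weight_kg": 3.0}]

print(contract.check(good))  # []
print(contract.check(bad))   # two violations: null ratio and wrong type
```

A monitor like this is the producer-owned, automated piece; the open question from the episode remains who responds, and how, once a violation is raised.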
Nachiket and I agree that there needs to be a minimum level of governance that is consistent, but we disagree on how much the governance could vary from team to team. For some teams, the data contracts can be designed to support great agility, autonomy and innovation.
Another area where we did not completely agree was on the topic of service level agreements. Nachiket felt that every data product should have service level objectives and defined service level indicators.
I largely agree, but I have seen data products used in fast-paced, highly innovative environments with a very low SLA. The service level agreements must be fit for purpose, so that your cost of ownership, governance and agility are aligned with the business objectives. We have seen what gold-plating did to our application portfolios ten years ago; we don't want to repeat those mistakes with data products.
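To illustrate the fit-for-purpose point, here is a small sketch of one common SLI, data freshness, measured against two different SLOs. The thresholds and product characterizations are invented for illustration: an experimental product carries a loose objective, while a curated, critical one carries a tight objective.

```python
# Hypothetical sketch: a freshness SLI evaluated against per-product SLOs.
# The thresholds below are illustrative assumptions, not recommendations.
from datetime import datetime, timedelta, timezone


def freshness_sli(last_updated, now):
    """SLI: hours since the data product was last refreshed."""
    return (now - last_updated).total_seconds() / 3600


def meets_slo(sli_hours, slo_hours):
    """The product meets its SLO when observed staleness is within bounds."""
    return sli_hours <= slo_hours


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

# A fast-moving, experimental product can live with a loose objective...
experimental_slo_hours = 72
# ...while a critical, curated product carries a tight one.
curated_slo_hours = 1

last_refresh = now - timedelta(hours=6)
sli = freshness_sli(last_refresh, now)  # 6.0 hours stale

print(meets_slo(sli, experimental_slo_hours))  # True
print(meets_slo(sli, curated_slo_hours))       # False
```

The same observed indicator passes one objective and fails the other, which is the whole argument: the SLO, not the SLI, is where fit-for-purpose judgment lives.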
Data mesh workbook
Start thinking through the data mesh concepts and discussions you've learned, and apply them to your own organization.