A recent Gartner survey of over 2,000 CIOs highlighted the importance of accelerating the time to value from their digital investments. The survey also noted, “CIOs’ future technology plans remain focused on optimization.” Many of these organizations are now looking at data products to propel their digital ambitions and to optimize data analytics. Data products help get the right data, in the right package to the right people. As part of a broader Data Mesh strategy, data products will create a competitive advantage for organizations that understand how to optimize them.
The data product race has begun, but this race isn’t about how many data products you can create, or how many data sources you can integrate. The winners in this race will be judged on delivery efficiency, consumer adoption and business value. The sustained winners in this race will possess the agility to continuously optimize data products in order to exploit external and internal data-driven opportunities.
This article is focused on optimizing across four key levers: cost, performance, risk and user experience. These levers should be balanced based on your priorities, and a careful understanding of the tradeoffs.
Define the Bigger Strategic Context
Before we dive into the data product optimization levers, let’s paint the bigger picture. We don’t optimize in a vacuum; we optimize across technology, process and people as part of a broader strategic plan. This article is focused on data products, but consider each of the following areas as you define your optimization goals:
Delivery Efficiency: Manage the Full Data Product Lifecycle
Delivery efficiency is impacted by every layer of the ecosystem, from the enterprise architecture, to the data platforms, to the domain teams that create the data products. We need to look at all avenues for driving efficiency, at every layer and in every stage of the data product lifecycle. Measure efficiency through the consumer lens and focus on the biggest bottlenecks. Instead of waiting for a long migration project, can we provide the consumer immediate access with a high-performing MPP query engine like Trino? Instead of relying on a central team of engineers to manage a backlog of transformation tasks, can we enable domain teams to work autonomously with self-service tools?
Consumer Adoption: Embrace Simplicity, Abstract Complexity
In a Data Mesh, you want to abstract the complexity and create a low-friction environment. Your consumers want a simple, fast process for finding the reusable, trusted data products they need. To drive adoption, consumers should be confident in the data they are accessing and in the tools they are using. They need training that describes the data management process and accountabilities; understanding the process helps them build trust in the data products. Empowering consumers to move at their own rapid pace to solve the problems they understand best is an incredible motivator.
Business Value: Optimize for Competitive Differentiation
Optimize in the areas that matter for your IT and business objectives. Current market conditions may push data product cost optimization to the top for your industry, while another industry may decide that data product compliance is critical for protecting its business models. Within a single organization, you will have various types of data products that can be optimized differently; an automation domain may accept a higher TCO for its data products due to the volume and velocity of the required data sets. As market conditions shift, you want the agility to re-optimize your design. As you consider each of the optimization levers, focus on the areas where differentiation creates real value today and then adjust with agility.
4 Data Product Optimization Levers
In the remainder of this article, we will describe four optimization levers. These levers are a great launching point for anyone who is getting started.
Optimize your Data Products for Cost
The consumer will appreciate the cost savings that come from accelerating project delivery and time-to-value. Project teams that were often stalled waiting for data pipeline and transformation backlogs should now have the autonomy and self-service capabilities to build and reuse data products.
To maximize your investment, look at the cost of building and changing data products. Who is building them, what skills are required? How do they get their data, how long does it take? What does it cost to change a product? What is the total cost of operations? Consider these levers for managing your TCO:
- Optimize Data Movement. Data movement increases costs (e.g. cloud egress, duplicate storage) and delays business delivery. You can build high-performing data products from your existing on-prem and multi-cloud data sources. Minimize migration and limit the data that needs to be moved by accessing it where it sits. Data migration should be the exception, not the rule.
- Optimize Compute. Some products can be built with very little transformation, while others require heavy compute time. Ideally, you want a platform that can automatically optimize when, where and how the compute cycles execute. Be careful: with some platforms, a simple misconfiguration could lead to an expensive compute lesson.
- Optimize Patterns. Data product templates promote consistency and they encourage teams to use the most efficient patterns. Provide your data product owners with a set of archetypes that meet common use cases and define why the pattern is preferred in terms of the 4 optimization levers (cost, performance, risk, user experience).
- Optimize Operations. Runbooks should be defined for each domain and should align with enterprise, central IT standards for areas like service tickets and access requests. Parts of the runbook are non-negotiable; in other areas, the domain can make the best choices for its specific objectives. Provide guidance for how to customize operations in the areas that create competitive differentiation for the domain and its data products.
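The archetype idea above can be made concrete as a small registry that rates each pattern against the four levers. This is a minimal sketch; the archetype names, use cases and 1–5 scores are illustrative assumptions, not a standard taxonomy.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Archetype:
    """A reusable data product pattern, scored 1 (low) to 5 (high) per lever."""
    name: str
    use_case: str
    cost: int          # relative TCO
    performance: int   # delivery and query speed
    risk: int          # level of governance controls required
    user_experience: int

# Hypothetical archetype catalog for illustration only.
ARCHETYPES = [
    Archetype("virtualized-view", "ad hoc analytics over source systems", 2, 3, 2, 4),
    Archetype("materialized-mart", "high-volume, low-latency dashboards", 4, 5, 3, 5),
    Archetype("streaming-feed", "near-real-time event consumption", 5, 5, 4, 3),
]

def recommend(max_cost: int, min_performance: int) -> list[str]:
    """Return archetype names that fit a cost ceiling and a performance floor."""
    return [a.name for a in ARCHETYPES
            if a.cost <= max_cost and a.performance >= min_performance]
```

A data product owner can then justify a pattern choice in lever terms, e.g. `recommend(3, 3)` trades some performance for a lower TCO, while `recommend(5, 5)` accepts higher cost for speed.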
Optimize your Data Products for Performance
Performance is based on speed and reliability, measured across the full data product lifecycle: integration with sources on the back end, data transformation in the middle and data consumption on the front end. Your optimization design will vary depending on the requirements of your specific data product use case. You should be able to deliver the right level of performance, at the right cost and within acceptable risk.
- Optimize Integration. Use market-standard connectors that are fully supported and specifically designed to access data from the target source. Maintaining connectors on your own increases costs, and home-grown connectors will not provide the reliability that you need. Look for a platform that provides a broad range of standard, high-performing connectors.
- Optimize Recoverability. Leverage a platform that can help you optimize for speed and reliability. If a data product transformation fails due to an issue anywhere in the lifecycle, you need to be able to automatically recover.
- Optimize Project Delivery. It doesn’t matter how fast the final data transformation executes if it takes the project team a month to migrate the data into the analytics engine. Use the right engine for the use case. Your goal should be to meet the minimum data product requirements for the business solution, as fast as possible. You should define standard, optimized data product development processes that are embedded within your project teams.
- Optimize Consumption. The data product should be easily accessible to consumers, whether they are building dashboards, applications or advanced analytics. They should be able to find a product and obtain the necessary entitlements quickly. Identify problematic or unused data products and remove them to further enhance the user experience.
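The automatic recovery described above can be sketched as a retry wrapper with exponential backoff around a transformation step. This is a simplified illustration; a real platform would also checkpoint pipeline state so recovery resumes mid-lifecycle rather than restarting from scratch.

```python
import time

def run_with_recovery(step, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run a transformation step, retrying with exponential backoff on failure.

    `step` is any zero-argument callable representing one stage of the
    data product lifecycle; `sleep` is injectable so tests run instantly.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == max_attempts:
                raise  # exhausted retries: surface the failure for alerting
            sleep(base_delay * 2 ** (attempt - 1))  # 0.1s, 0.2s, 0.4s, ...
```

Transient failures (a dropped connection, a throttled source) recover automatically, while persistent failures still surface quickly to the operations runbook.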
Optimize your Data Products for Risk and Compliance
Optimizing for risk and compliance ensures that you are applying the right level of controls for the given situation. A typical model is one where you apply a standard, consistent level of minimum governance for every domain and data product. Then, you enable each domain to apply additional governance controls per its unique requirements. Let’s consider a few of these:
- Optimize Governance. Domains and data products should be designed with a governance approach that is aligned with their unique risk profile. An innovation domain that is driving discovery and innovation within a closed lab should be allowed to operate with minimum controls. A different domain that is supplying products for a highly-regulated financial report will operate with higher controls.
- Optimize Access Management. Products should be designed to help accelerate secure consumption and democratization across the organization. To improve the user experience, teams should be allowed to read the basic descriptions for every data product. But, they should only see the actual data if they have the proper entitlements. Your access management process should enable teams to request access at the data product level, and receive a timely response.
- Optimize Auditability. You should be able to provide an audit trail from the moment data is collected from the source through the development of the data product and up to the point that it is consumed. The trail should define time, user and the data sets. This level of auditability will enable your infosec and compliance teams to accurately detect issues.
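The baseline-plus-raise model described above can be sketched as a policy merge where domain overrides may only strengthen a control, never weaken it. The control names and levels here are assumptions for illustration, not a reference implementation.

```python
# Ordered from weakest to strongest; index position encodes strictness.
CONTROL_LEVELS = ["none", "basic", "standard", "strict"]

# Hypothetical enterprise minimum applied to every domain and data product.
ENTERPRISE_BASELINE = {
    "access_review": "basic",
    "data_classification": "standard",
    "audit_logging": "basic",
}

def effective_policy(baseline: dict, domain_overrides: dict) -> dict:
    """Merge domain overrides onto the baseline, rejecting any weakening."""
    policy = dict(baseline)
    for control, level in domain_overrides.items():
        current = policy.get(control, "none")
        if CONTROL_LEVELS.index(level) < CONTROL_LEVELS.index(current):
            raise ValueError(f"{control}: cannot lower {current} to {level}")
        policy[control] = level
    return policy
```

A closed innovation lab simply inherits the baseline, while a domain feeding regulated financial reports raises `audit_logging` to `strict`; an attempt to drop below the baseline fails loudly.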
Optimize your Data Products for User Experience
User experience improves with data products because they give consumers the ability to quickly find what they need and actually understand the data being presented. This is a game changer. Whether you are a data scientist or a business analyst, data products will improve your productivity. If you want the consumer to drive fast, give them the keys and get out of the way.
- Optimize Catalog. Improve the user experience by providing a catalog or marketplace where teams can easily search for, study and consume data products. To truly optimize the user’s experience, promote strong metadata practices that ensure a high level of consistency. Every data product should have a rich description that any consumer can review and understand. When you are searching for data, the metadata and descriptions are invaluable; they save time and minimize mistakes. Set a high standard and challenge your data product owners to exceed it.
- Optimize Reusability. Reuse promotes consistency, reduces duplicate development effort and accelerates delivery. If you are just starting your Data Mesh journey or building a new domain, expect to build more new data products than you reuse. But once you create some momentum, reuse becomes more valuable. To promote reuse, data product owners should be open to answering questions and receiving feedback on how to improve their products.
- Optimize Adoption. The user’s experience will be greatly improved if you spend the time to train them. This is as much about setting expectations as it is about exploiting capabilities. Offer varying levels of certification, from citizen to astronaut and encourage users to pursue continuous learning. Motivate users by enabling new capabilities, based on their learning achievements.
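The catalog practices above can be sketched as a tiny marketplace that refuses to publish products with thin descriptions and lets anyone search metadata without data entitlements. The field names and the ten-word description threshold are illustrative assumptions.

```python
MIN_DESCRIPTION_WORDS = 10  # assumed quality bar, tune to your standard

class Catalog:
    """Minimal data product catalog: metadata is searchable by everyone."""

    def __init__(self):
        self._products = {}

    def publish(self, name: str, description: str, tags: set) -> None:
        # Enforce the metadata quality bar at publish time.
        if len(description.split()) < MIN_DESCRIPTION_WORDS:
            raise ValueError(f"{name}: description too thin to publish")
        self._products[name] = {"description": description, "tags": tags}

    def search(self, keyword: str) -> list:
        """Match names, descriptions and tags; returns sorted product names."""
        kw = keyword.lower()
        return sorted(
            name for name, meta in self._products.items()
            if kw in name.lower()
            or kw in meta["description"].lower()
            or any(kw in t.lower() for t in meta["tags"])
        )
```

Gating publication on description quality is what makes search useful later: the consumer finds `customer-360` by the business terms in its description, not by guessing table names.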
The data product race has begun, are you ready to get in the game? It is never too early to start thinking about optimization, but you may not be ready to actually execute a meaningful program.
If you are just starting a Data Mesh Pathfinder with your first domain and data products, you are not ready to optimize. This is a good time to learn and align with your data producers and consumers; to define your optimization priorities. Allow data product teams to learn, run fast and invent. Use this time to fully understand the optimization capabilities offered by your platform and the broader ecosystem.
You should start your optimization journey after you complete your initial pathfinder or MVP. Work with a trusted Data Mesh advisor to review your designs and to help you identify where to get started.
The big winners in this race will possess the agility to continuously optimize data products in response to internal and external changes. With the right platform, optimizing across these four key levers (cost, performance, risk and user experience) will give you the competitive advantage you need.