Data products combine curated data sets with application programming interfaces (APIs) to transform raw data into a product that’s fit for consumption by downstream users. The goal of data products is to make data accessible, consumable, insightful, and actionable for the increasing number of stakeholders who rely on data to inform their decision-making.
Also referred to as “data as a product,” a data product differs from a traditional data lake or data warehouse in several ways. First and foremost, a data product is not meant to be a centralized or complete source of data. On the contrary, it contains data specific to just one domain (sales, marketing, DevOps, etc.).
Data products treat data as more than just an IT resource. Data has immense value, but it’s hard to extract information that drives business objectives from raw data. Data products take raw data and translate it into something relevant and useful within its domain — a product people can utilize to achieve business goals. The people who build data products are also responsible for security, provenance, and ownership so that the final product better reflects the technical requirements of the data within the domain.
Data products are closely associated with the decentralized data mesh. In a data mesh, domain-specific data sources are linked together but independently managed, rather than consolidated into a single repository as they are with a data mart.
Data marts require a centralized team to take responsibility for data across product sets and lines. With a vast quantity of data to manage and requests to accommodate, users often get “spaghetti” data (deep but narrow) that lacks important context. Data meshes make data easier to access by putting data “closer” to the end users via manageable data chunks for local experts to model, analyze, and build into data products.
The benefits of data products build on the benefits of data mesh more broadly. Replacing centralized data silos with a mesh of independent domains makes it possible to create data products in the first place and to populate them with highly relevant (and therefore highly valuable) data and features. Data products are where the advantages of a data mesh become tangible.
Data products can take countless forms across multiple different domains, but there are five common types:
The IT team at a hospital is responsible for managing and maintaining all the medical devices in use throughout the facility. That is a huge number of devices, they are continuously moved around the hospital, and devices are added or removed regularly. To do its job, the IT team needs to know the location of each device in real time.
Each location is a data point. A data product collects those data points, presents them in an easily digestible format, and automatically updates the interface as devices change location. The IT team may have created the data product and be the primary user, but other teams will also want to know the location of devices, so the data product shares data within and between domains. That way, the nursing team can find a device when they need it, or the accounting team can track device costs. This and all data products bridge the gap between data and the people who need it.
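To make the hospital example concrete, here is a minimal Python sketch of what such a data product might look like. Everything in it is an assumption made for illustration: the class names `DeviceLocation` and `DeviceLocationProduct`, the `owner` label, and the method names are hypothetical and do not describe any particular vendor's API. It only shows the general idea of collecting location data points, keeping the latest one per device, and exposing simple queries that other teams can use.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


@dataclass
class DeviceLocation:
    """One curated data point: where a device was last seen."""
    device_id: str
    location: str
    updated_at: datetime


@dataclass
class DeviceLocationProduct:
    """Hypothetical data product owned by the hospital IT domain."""
    owner: str = "hospital-it"  # assumed ownership label, for illustration
    latest: Dict[str, DeviceLocation] = field(default_factory=dict)

    def record(self, device_id: str, location: str,
               when: Optional[datetime] = None) -> None:
        """Ingest a raw data point, keeping only the newest per device."""
        if when is None:
            when = datetime.now(timezone.utc)
        current = self.latest.get(device_id)
        if current is None or when >= current.updated_at:
            self.latest[device_id] = DeviceLocation(device_id, location, when)

    def retire(self, device_id: str) -> None:
        """Remove a device that has left the facility."""
        self.latest.pop(device_id, None)

    def where_is(self, device_id: str) -> Optional[str]:
        """Query for other teams, e.g. nursing looking for a device."""
        entry = self.latest.get(device_id)
        return entry.location if entry else None

    def devices_in(self, location: str) -> List[str]:
        """List the IDs of devices currently recorded at a location."""
        return sorted(d.device_id for d in self.latest.values()
                      if d.location == location)


# Usage: a device moves, and the product reflects the newest location.
product = DeviceLocationProduct()
t0 = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
product.record("pump-17", "ICU", t0)
product.record("pump-17", "Ward 3", t0 + timedelta(minutes=30))
product.record("monitor-4", "Ward 3", t0)
print(product.where_is("pump-17"))
print(product.devices_in("Ward 3"))
```

In this sketch the consumers never touch the raw stream of location events. They call `where_is` or `devices_in`, which is the sense in which the product, rather than the raw data, is what gets shared between domains.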
The benefits of data products differ for those who build them and those who use them:
The philosophy of data products — treating data as a product — helps to focus the efforts of data creators. This is in contrast to data projects of the past, which were bigger undertakings trying (and often failing) to accomplish many things at once. Data creators working within one domain have more expertise and more resources to build something that not only works but drives value.
Data products can automate much of the time people spend finding, organizing, and analyzing data while eliminating many of the errors that manual work introduces. Furthermore, streamlined, self-service access to highly relevant data leads to better decision-making by giving end users exactly what they want and need to know. Data products drastically improve the odds of getting it right.
In terms of technical and logistical hurdles, distributing data through data products is fundamentally less challenging than fulfilling data requests individually. However, treating data as a product creates new challenges specific to product development, and domain expertise alone does not easily overcome them.
Great data products combine product, business, tech, and data perspectives into something that excels at all four, but that’s difficult to do. Products that are both valuable to use and practical to build are rarely obvious, and promising projects can go sideways without carefully managing development and enlisting diverse inputs. Ideas are common — amazing products (data or otherwise) are not.
Developing a data product requires both technical validation (the product works) and user validation (people like it). Leaning too heavily in one direction or the other can compromise the finished product, but striking the right balance proves difficult, especially when working on short deadlines.
Data products require successive refinement to get right. And since the data they draw on changes (sometimes significantly) over time, products based on that data will change as well. Even the simplest data products take regular evaluation and iteration, but it can be easy to neglect that effort, and the wrong changes make a product worse.