Operationalizing data products at scale with AI

See how AI and data products are coming together to improve the user experience and to accelerate the delivery of innovative business solutions

AI capabilities are at the top of every data strategy and referenced in every executive boardroom as a strategic business imperative. AI, in its broadest form, has been around since the 1950s and has long been a specialized data capability across many industries and sectors. Yet, just a couple of years ago, AI was not at the top of most executive data agendas. How did AI suddenly become the dominant topic in digital transformation?

Data engineers and scientists point to some significant shifts coming together at the right time. Rapid advancements in GPU and neural network technology, enabling the use of massive data assets for model training, have fueled the exponential growth of AI innovation. 

My view is that the most significant factor in the recent popularity of AI is the perception that it is easy for anyone to access. When ChatGPT became publicly available, it felt as real as that self-driving car, except there was no waitlist or prohibitive cost. Everyone, everywhere, was instantly empowered to ask their own questions and seek their own insight. To be proficient in using ChatGPT, you didn’t need two PhDs or to know how to talk to a REST interface; you just had to learn how to ask the question. ChatGPT users quickly shifted from using a traditional search engine to prompting an answer engine.

I see the same rapid shift when teams start using data products. Consumers stop worrying about searching and waiting for data and start focusing on using it to build business solutions and insights. Data products drive the perception that working with data doesn’t have to be difficult. For consumers, the ChatGPT moment comes when using data products is so easy that they forget how hard it used to be to find the correct data. Implemented correctly, AI and data products can work together to accelerate adoption and improve ease of use.

This article explores how AI and data products are coming together to improve the user experience and accelerate the delivery of innovative business solutions. Some of the examples are based on conceptual discussions, and others are based on actual use cases. Early strategic vision is usually painted with some crayons (curiosity and imagination) and watercolors (experience and detail).

Viewing data products with a business lens

Before we get into some examples, I want to first position the business perspective. When we talk about AI, our business customers immediately think of automation and rapid answers. The how isn’t as important; it is all about the what and when. What can we do right now to give us a competitive advantage, or what is the market doing that we are not?  The business will view data products based on the solutions they enable, and most of them may never see a raw data product – all they see is the final insight. When we talk about AI-driven data products, your business teams will hear “faster data solutions and better insight.”  At its core, access to more data drives AI innovation. 

This is where Starburst emerges as a transformative force, reshaping the landscape of data product management with a distinct business-oriented perspective and fueling the exponential growth of AI adoption. 

The image below demonstrates how Starburst is being used to accelerate analytics across leading data lake solutions while also enabling performant federation across other cross-cloud or on-prem data sources. For data science teams training new models, immediate access to the data and the lakehouse’s power are the game changers. Business teams can focus on developing new AI-driven business solutions. Starburst will abstract that back-end data architecture complexity. 

Streamline data product design

Envision a future where data products are automatically created and recommended to consumers based on enterprise and industry trends for their unique profiles. To train the engines, we provide data on usage trends, ontologies, and user profiles, and we never stop training. The AI engine develops the products and automates the documentation. AI will pull together metadata, fill in gaps, and generate a data product description tailored to each consumer. These pre-built ‘answers’ will accelerate the development of new ideas, new questions, and new business solutions. Data access will be dynamically defined based on a set of attributes that create an accurate, instant risk profile. 

AI can analyze historical performance, risk, and user experience across different design patterns and automatically select the ideal design for specific product types. AI can also help to create a personalized user experience by predicting which features a user will find most useful and customizing the design accordingly.

What we are seeing today:

  • AI-driven knowledge graphs are very useful for exploring new data product opportunities. Even simply pulling all the metadata together into a data catalog provides an excellent accelerator.
  • A non-technical user can use natural-language prompts in ChatGPT to generate a query for the requested datasets. That query pulls the data together for a data product.
  • Attribute-based access controls are being applied to data products. As these use cases mature, we can expect AI to play a stronger role. 
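
To make the attribute-based access control idea concrete, here is a minimal sketch in Python. It is an illustration only, not how Starburst or any specific product implements access control. The attribute names (sensitivity, clearance, region), the level ordering, and the can_access helper are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical sensitivity levels, ordered from least to most restricted.
_LEVELS = {"public": 0, "internal": 1, "restricted": 2}


@dataclass(frozen=True)
class DataProduct:
    name: str
    sensitivity: str            # one of the keys in _LEVELS (assumed labels)
    allowed_regions: frozenset  # regions from which the product may be read


@dataclass(frozen=True)
class Consumer:
    user_id: str
    clearance: str              # one of the keys in _LEVELS (assumed labels)
    region: str


def can_access(consumer: Consumer, product: DataProduct) -> bool:
    """Grant access only when every attribute rule is satisfied."""
    cleared = _LEVELS[consumer.clearance] >= _LEVELS[product.sensitivity]
    in_region = consumer.region in product.allowed_regions
    return cleared and in_region


sales = DataProduct("sales_summary", "internal", frozenset({"eu", "us"}))
analyst = Consumer("analyst-1", "restricted", "eu")
print(can_access(analyst, sales))
```

The decision is computed from attributes at request time rather than from a fixed list of users, which is what would allow a risk profile to change as the attributes change.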

Optimize data product operations

I believe that data product operations will see the fastest evolution with AI, because the potential value is incredible. In the near future, AI will be used to automate data product operations from end to end. AI will manage data quality in real-time, checking for errors, inconsistencies, or anomalies. AI will be used to identify and mitigate potential security risks, calculating dynamic risk profiles as data sets are continuously aggregated. 

Predictive maintenance algorithms will identify and correct operational bottlenecks and potential failure points before they result in downtime. If data changes on the back end, it will be detected, and the data product will be automatically adjusted to ensure consistency. If data is suddenly unavailable, an algorithm could use trends to predict the missing data sets and ensure continuity of the front-end solution. 
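
Detecting a back-end change is the first step in that scenario. The sketch below compares the schema a data product expects with the schema observed at the source. It is a simple rule-based comparison, not an AI technique, and the diff_schema function and the column names are hypothetical; a real system would read both schemas from its catalog.

```python
def diff_schema(expected: dict, observed: dict) -> dict:
    """Compare two {column_name: type_name} mappings and report drift."""
    added = {c: t for c, t in observed.items() if c not in expected}
    removed = {c: t for c, t in expected.items() if c not in observed}
    changed = {
        c: (expected[c], observed[c])
        for c in expected
        if c in observed and expected[c] != observed[c]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical schemas: what the data product was built on vs. what the source has now.
expected = {"order_id": "bigint", "amount": "double"}
observed = {"order_id": "bigint", "amount": "decimal", "region": "varchar"}

drift = diff_schema(expected, observed)
if any(drift.values()):
    print("Schema drift detected:", drift)
```

A drift report like this could feed whatever automation adjusts the data product, or simply alert its owner.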

What we are seeing today:

  • AI is already being used across IT Operations, from support chatbots to cybersecurity to predictive failure analysis. 
  • AI-driven ‘virtual-engineers’ are being used to review query performance across a global enterprise. 
  • AI is used to identify and correct data quality issues, validate data inputs, and highlight duplicate records. 
  • AI is being used with data observability tools to improve data lineage and overall data health.
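
As a point of reference for the data quality item above, the following sketch shows the kind of rule-based baseline (duplicate detection and input validation) that an AI-driven tool would extend. It is an assumption-laden illustration: the function names, field names, and sample records are hypothetical.

```python
from collections import Counter


def find_duplicates(records: list, key_fields: list) -> list:
    """Return the key tuples that appear more than once in the records."""
    keys = [tuple(r.get(f) for f in key_fields) for r in records]
    counts = Counter(keys)
    return [k for k, n in counts.items() if n > 1]


def missing_fields(record: dict, required_fields: list) -> list:
    """Return the required fields that are absent or empty in a record."""
    return [f for f in required_fields if record.get(f) in (None, "")]


# Hypothetical input records for a customer data product.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
]

print(find_duplicates(records, ["id", "email"]))
print([missing_fields(r, ["id", "email"]) for r in records])
```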

Accelerate data product consumption

AI will materially simplify and enhance the user experience in the future. AI will provide personalized insights based on each consumer’s profile. As consumers continue to reuse and create new data products, the AI engine will learn which types of data products to recommend and how to design them. Consumers will interact with data in more intuitive ways to accelerate ideation and new insights. AI can also automate report and dashboard generation, reducing the need for manual analysis. 

What we are seeing today:

  • AI visualizations are already facilitating active exploration of data sets, and this is improving rapidly. 
  • Data products are being used to accelerate AI models, enabling data scientists to quickly find and reuse data sets that are key to new models.
  • Natural language questions are used to support data product exploration and integration.  
  • AI-driven chatbots are being used to provide consumer support, helping to address data product questions and tickets.

Challenges

This article presents an optimistic, ambitious view of how AI and data products could work together in the future. I want to paint a picture of the art of the possible and ground it in some of the fundamental advances we are already seeing. We should also recognize that neither AI nor data products will fully succeed until we address challenges in data quality, compliance, and business-focused data ontologies, to name a few. My advice to teams focused on these challenges is to pause and reset your approach. Every data governance leader should be managing a strategy for how data will be consumed in the future (e.g., data products) and how it will be transformed (e.g., via AI).

Getting Started

Data products are used across industries to deliver many different types of solutions. If you are just getting started, the best advice is to start with a small team and a handful of data products. It is essential that the IT teams, and perhaps one small business team, become proficient at using data products within your environment before you expand. You can add automation and additional features in future phases; the initial goal is to learn how data products fit into your current ecosystem (process, people, technology). I would advise against building a large platform or migrating any datasets; this will just add unnecessary cost, delay, and complexity to your initial MVP. Transformation initiatives need early wins to build momentum and confidence to overcome the bigger challenges.

FAQs about AI and AI Data Products

How do AI and data products make data more accessible for business users?

AI and data products, together, simplify data interaction, offering a “ChatGPT moment” in which complex data access becomes intuitive. This shift allows consumers to focus on deriving insights and building business solutions rather than struggling to find or prepare data. By abstracting backend complexities, these tools enable users to easily consume data, much like generative AI tools empower anyone to ask questions and get answers.

In what ways does AI contribute to the design and operational efficiency of data products?

AI streamlines data product design by automating creation, recommending products based on trends, and generating customized descriptions from metadata. In operations, AI can automate quality checks, identify security risks, and predict maintenance needs, ensuring continuous data consistency and availability. This proactive management significantly improves the reliability and performance of data products.

How does Starburst facilitate the integration of AI with data product strategies?

Starburst acts as a transformative force by abstracting complex backend data architectures, thereby accelerating analytics and enabling performant federation across diverse data sources. For data science teams, it provides immediate access to massive data assets for model training, fueling AI innovation. This enables business teams to concentrate on developing new AI-driven solutions without grappling with underlying data infrastructure challenges.

What initial steps should organizations take when beginning their journey with AI and data products?

Organizations should start with a small, focused team and a manageable number of data products to learn how they integrate into their existing ecosystem. The initial goal is to understand processes, people, and technology, rather than building a large platform or migrating extensive datasets, which can add unnecessary cost and complexity. Focusing on early wins helps build momentum and confidence to tackle larger challenges later in the transformation initiative.
