New integrations extend Starburst capabilities to Google Cloud’s Dataplex and BigQuery

Last Updated: January 9, 2024

Organizations are generating ever-larger volumes of data, and these datasets rarely reside in the same place or source. Even enterprises with large data lake footprints typically still keep important data in traditional warehouses or other storage systems.

Yet all of this data needs to be accessible to data scientists and business intelligence analysts if they hope to extract the insights that drive better, more informed business decisions. 

Data centralization creates costly bottlenecks that put strain on infrastructure teams and limit the ability of data consumers to discover, access, and analyze distributed datasets. Organizations large and small are catching up to the reality that data is going to be distributed across and within multiple clouds, data lakes, and warehouses. 

To make it easier to analyze important data across sources, companies are increasingly creating data products: high-quality datasets curated from distributed data and maintained by the business domains that know the data best.

Starburst provides the query engine that allows you to analyze all your data, no matter where it resides, and build data products that join data from different sources – without needing to move it. Downstream data consumers get an approved, governed, and accessible directory of data products.
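To make the idea of joining data across sources concrete, here is a minimal sketch in Python. It only builds the SQL text for a cross-source join and does not connect to anything. It assumes Trino-style `catalog.schema.table` naming, and the catalog, schema, table, and column names (`lake.sales.orders`, `warehouse.crm.customers`, `customer_id`) are hypothetical placeholders, as is the helper `federated_join_sql` itself.

```python
def federated_join_sql(left_table: str, right_table: str, key: str, limit: int = 10) -> str:
    """Build a SQL join across two fully qualified tables.

    Each table name is assumed to follow the catalog.schema.table pattern,
    where the catalog identifies the underlying source (hypothetical names).
    """
    for name in (left_table, right_table):
        # Require exactly three non-empty parts: catalog, schema, table.
        parts = name.split(".")
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected catalog.schema.table, got {name!r}")
    return (
        "SELECT l.*, r.*\n"
        f"FROM {left_table} AS l\n"
        f"JOIN {right_table} AS r ON l.{key} = r.{key}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    # Hypothetical example: orders in a data lake, customers in a warehouse.
    print(federated_join_sql("lake.sales.orders", "warehouse.crm.customers", "customer_id"))
```

The point of the sketch is that the two tables sit in different catalogs, so a single query can reference both sources without first copying either dataset.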

Starburst has a built-in data products directory that enables data teams to very quickly search, discover, and identify data products that might be of interest, expanding the view across and beyond the Google Cloud ecosystem. 

Today, Starburst is excited to announce our latest innovation with our partners at Google Cloud: a pair of integrations that bring the power of Starburst to Google Cloud’s data analytics solutions.

Extending data management and governance with Dataplex to Starburst customers 

Announced in 2021, Google Cloud’s Dataplex helps customers develop governance and management strategies for federated data products. It is one of Google Cloud’s answers to the problems that come with siloed, distributed datasets.

Dataplex is an intelligent data fabric designed to help you discover, manage, monitor, and govern data across your Google Cloud data lakes, data warehouses, data products, and data marts. Many enterprise customers have petabytes of data stored across various data systems and Dataplex provides a central data management strategy that allows customers to build and maintain an intelligent data fabric. 

Starburst helps customers utilize the capabilities of Dataplex for data in on-prem systems and other clouds. With our new integration, Starburst functions as a distributed query engine within Dataplex, making metadata associated with those other data sources accessible within the Dataplex catalog. 

Since Starburst can surface this metadata in Dataplex, users will be able to discover this data even when it’s living in other clouds or on-prem systems. Think of it as a Starburst-enhanced version of Google Cloud’s BigLake. We’re giving companies the opportunity to build and maintain a true data mesh that encompasses all enterprise datasets, including those outside Google Cloud. 

This integration gives customers the ability to use Dataplex to discover and govern federated data products powered by Starburst, curated from data that lives in sources and locations outside of Google Cloud.

BigQuery plus Starburst

A serverless and cost-effective analytics lakehouse, BigQuery offers built-in machine learning, real-time analytics with built-in query acceleration, and the ability to unify, manage, and govern all types of data. 

The Starburst integration extends those capabilities so that BigQuery can read datasets from other clouds and on-prem sources. Starburst surfaces materialized views with BigLake, Google Cloud’s unified storage engine. This way, customers can query any data that Starburst can access, whether it lives in other cloud data lakes or in on-prem sources.

The end result is similar to the Dataplex integration: Starburst broadens the reach of Google Cloud’s native tools, giving organizations access to more data and the ability to extend their data mesh and, ultimately, discover business insights faster.

Try Starburst from BigQuery Partner Center

To launch these new integrations, and ensure that as many organizations as possible get the chance to experiment with these capabilities, Starburst will be featured in the BigQuery Partner Center. As a customer, you can click through to initiate a free, two-week, proof-of-concept trial and immediately start extending your BigQuery reach to more of your enterprise datasets. 

You have the data, but it’s not doing your BI and Data Science teams much good if it’s siloed off in separate systems. Why not switch on the power of Starburst and Google Cloud and access all your data instead?
