Building and governing a data mesh with Starburst and AWS Lake Formation

Last Updated: May 13, 2024

The increasing popularity of data lakes isn’t surprising anyone in the analytics space. The appeal of importing data from multiple sources into a single data lake, then providing access to hundreds, or even thousands, of users is undeniable.

The catch? Building enterprise-scale data lakes is a historically cumbersome, complex, and extremely time-consuming process.

AWS developed AWS Lake Formation as an answer to these challenges, giving organizations the tools to quickly build data lakes while simplifying security management. In short, AWS Lake Formation addresses the technical complexity and security challenges of building a petabyte-scale data lake. The technology will be an enormous help to AWS customers burdened by distributed data access and security, even within the AWS ecosystem.

I’m excited to share the news that Starburst Enterprise now supports AWS Lake Formation in public preview.

Starburst and AWS Lake Formation

This is just the latest evolution in our longstanding partnership. AWS has been a critical Starburst partner since our early days, and we’ve helped our joint customers uncover transformative insights, reduce costs by 50%, accelerate queries, and much more. Yet this latest development could have an even broader organizational impact.

AWS Lake Formation has emerged as an essential tool for companies looking to shift to a data mesh, the architectural framework that delivers faster, more accurate access to distributed datasets. In addition to its key security benefits, AWS Lake Formation provides self-service data access capabilities that are core to the data mesh approach and robust data lake analytics.

Starburst and AWS

AWS Lake Formation makes it easier for AWS customers to set up a secure data lake. It helps customers import data into Amazon S3, catalog and label data, and apply row- and cell-level security and access controls.

AWS Lake Formation provides a single place to manage access control policies. You can define security policies that restrict access to data at the database, table, column, row, and cell levels. These policies apply to AWS Identity and Access Management (IAM) users and roles, and to users and groups when federating through an external identity provider.
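To make those policy levels concrete, here is a minimal sketch of the request payloads for a column-level grant and a row-level (data cells) filter. It only builds dictionaries shaped for the boto3 Lake Formation client's `grant_permissions` and `create_data_cells_filter` calls; it does not contact AWS. The helper names, the role ARN, the account ID, and the database, table, and column names are all hypothetical and chosen for illustration.

```python
from __future__ import annotations

from typing import Any


def build_column_grant(principal_arn: str, database: str, table: str,
                       columns: list[str]) -> dict[str, Any]:
    """Build keyword arguments for a column-level SELECT grant.

    Hypothetical helper: the result is shaped for
    boto3.client("lakeformation").grant_permissions(**kwargs).
    """
    if not columns:
        raise ValueError("a column-level grant needs at least one column")
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


def build_row_filter(account_id: str, database: str, table: str,
                     filter_name: str, row_expression: str,
                     excluded_columns: list[str]) -> dict[str, Any]:
    """Build keyword arguments for a row- and cell-level filter.

    Hypothetical helper: the result is shaped for
    boto3.client("lakeformation").create_data_cells_filter(**kwargs).
    """
    return {
        "TableData": {
            "TableCatalogId": account_id,
            "DatabaseName": database,
            "TableName": table,
            "Name": filter_name,
            # Rows are kept only when this expression evaluates to true.
            "RowFilter": {"FilterExpression": row_expression},
            # Every column is visible except the ones listed here.
            "ColumnWildcard": {"ExcludedColumnNames": list(excluded_columns)},
        }
    }


# Example values are placeholders, not real resources.
grant = build_column_grant(
    "arn:aws:iam::111122223333:role/analyst", "sales", "orders",
    ["order_id", "total"],
)
cells_filter = build_row_filter(
    "111122223333", "sales", "orders", "emea_only",
    "region = 'EMEA'", ["customer_email"],
)
```

With AWS credentials configured, these dictionaries could be passed as keyword arguments to the corresponding boto3 calls. Building them separately keeps the policy definition easy to review and test before anything is applied.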

Data mesh + data lakes

Yes, building data lakes can be difficult, but implementing a data mesh architecture isn’t necessarily easy either. The benefits are significant, though, as many of our customers will attest. There’s also a fundamental appeal to the data mesh approach: it acknowledges the reality of most large enterprises today. Organizations don’t have all their data in one place. The single source of truth, for various reasons, remains an unrealistic dream.

Starburst makes it easier, more efficient, and faster to extract insights from distributed data, and a Starburst-driven data mesh architecture only amplifies these core benefits.

Which brings me back to AWS Lake Formation. We’re now supporting read access on Hive, which lets AWS data lakes function as another node in a distributed AWS data mesh architecture. That gives BI analysts, data scientists, and other users fast query access to entirely new datasets and data products.

Now available in public preview:

  • Customers can apply AWS Lake Formation policies on the Hive catalog for read-only access
  • Customers can enforce AWS IAM and AWS Lake Formation policies for fine-grained access control
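As a purely conceptual illustration of what read-only, fine-grained enforcement means, the sketch below models a policy check in plain Python. It is not Starburst's or AWS's implementation; the `TableGrant` type, the `authorize` function, and the user, table, and column names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass

# Only read operations are permitted in this toy model.
READ_OPERATIONS = frozenset({"SELECT"})


@dataclass(frozen=True)
class TableGrant:
    """A hypothetical column-level read grant for one principal on one table."""
    principal: str
    table: str
    columns: frozenset[str]


def authorize(grants: list[TableGrant], principal: str, table: str,
              operation: str, requested_columns: list[str]) -> list[str]:
    """Return the requested columns if the read is allowed, else raise."""
    if operation.upper() not in READ_OPERATIONS:
        raise PermissionError(f"{operation} is not allowed: access is read-only")

    # Union of all columns granted to this principal on this table.
    allowed: set[str] = set()
    for grant in grants:
        if grant.principal == principal and grant.table == table:
            allowed |= grant.columns

    denied = sorted(set(requested_columns) - allowed)
    if denied:
        raise PermissionError(f"no SELECT grant on columns: {', '.join(denied)}")
    return list(requested_columns)


grants = [TableGrant("analyst", "sales.orders", frozenset({"order_id", "total"}))]
visible = authorize(grants, "analyst", "sales.orders", "SELECT", ["order_id", "total"])
```

In this model, a write such as `INSERT` is rejected outright, and a `SELECT` that touches a column outside the grant fails rather than silently returning less data. A real engine may handle ungranted columns differently.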

Together with AWS, we’re going to be helping customers build data mesh architectures that enable cost-effective, multi-region analytics at scale. Our joint customers will be able to extract insights from their distributed data without compromising security, compliance or data integrity.

AWS Lake Formation Integration: All about insights

Ultimately, we want to help enterprises to turn data into results, and that means providing fast, secure access to all their datasets, regardless of where that data resides. We’ve already seen the impact of data mesh architectures at some of the largest organizations in the world.

We know the data lake is here to stay, and that it will prove to be a key source within a data mesh. Filling that role also requires robust data lake analytics and efficient data pipelines.

Now, with AWS Lake Formation support, we’re excited to see how Starburst and AWS can help more enterprises start turning all that data into actionable insights and profitable programs.

We are thrilled to announce our new AWS Lake Formation Integration (public preview) to enable organizations to federate data securely beyond the AWS ecosystem.

Together, Starburst and AWS Lake Formation are on a journey to help joint customers implement the technical aspects of data mesh, the framework for distributed data management which ensures faster and more accurate access to critical data to drive business decisions.

By federating data across multiple AWS and third-party environments with Starburst Enterprise, organizations can realize the full value of their multi/hybrid cloud investments, achieving business insights faster, regardless of where the data resides.

No more cumbersome, complex, and time-consuming configuration and management of data lakes alongside other sources. Customers can now combine the fine-grained security of AWS Lake Formation with access to data anywhere through Starburst Enterprise.
