Building an open lakehouse just got easier with Starburst Galaxy & Tabular

Tabular Iceberg, right ahead

Company | May 17, 2023

Matt Fuller

Vice President, AI/ML Products

Starburst

Ryan Blue

Co-creator of Apache Iceberg, Co-founder, Tabular

Tutorial: Connecting Starburst Galaxy to Tabular

Today, we are excited to announce that the Tabular connector is generally available in Starburst Galaxy. Now, you can get all the benefits of using Trino and Iceberg to build your modern open data lake without worrying about the operational overhead.

Background

For the past couple of years, Starburst has been sharing Trino’s perspective on the benefits of the Apache Iceberg table format and has seen the adoption of Iceberg in the Trino community skyrocket.

Ryan Blue and Dan Weeks created Apache Iceberg at Netflix to solve the pain their head of data engineering, Jason Reid, was experiencing.

Iceberg is an open source, high-performance table format that enables an engine like Trino to efficiently run data warehouse-style SQL operations such as UPDATE, DELETE, and MERGE on your modern open data lake, backed by object storage such as S3, Azure’s ADLS, Google Cloud Storage, and MinIO.
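To make the UPDATE, DELETE, and MERGE commands mentioned above concrete, here is a minimal sketch of what such statements could look like in Trino SQL against an Iceberg table. The statements are held in Python strings so they can be handed to whichever Trino client you use. The catalog, schema, table, and column names (`tabular.demo.orders`, `order_updates`, `order_id`, `status`) are assumptions made up for illustration, not names from Starburst Galaxy or Tabular.

```python
# Hypothetical names: a catalog called "tabular", a schema "demo",
# a target table "orders", and a staging table "order_updates".
TABLE = "tabular.demo.orders"
STAGING = "tabular.demo.order_updates"

# Row-level changes expressed as ordinary Trino SQL.
DML_STATEMENTS = {
    "update": f"UPDATE {TABLE} SET status = 'shipped' WHERE order_id = 1001",
    "delete": f"DELETE FROM {TABLE} WHERE status = 'cancelled'",
    "merge": f"""
        MERGE INTO {TABLE} AS t
        USING {STAGING} AS s
        ON t.order_id = s.order_id
        WHEN MATCHED THEN
            UPDATE SET status = s.status
        WHEN NOT MATCHED THEN
            INSERT (order_id, status) VALUES (s.order_id, s.status)
    """,
}


def normalized(sql: str) -> str:
    """Collapse runs of whitespace so a statement prints on one line."""
    return " ".join(sql.split())


if __name__ == "__main__":
    for name, sql in DML_STATEMENTS.items():
        print(f"{name}: {normalized(sql)}")
```

The MERGE statement is the usual upsert pattern: rows in the staging table that match on `order_id` update the target, and the rest are inserted.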

Additionally, Iceberg solves other major data lake challenges with capabilities like:

Schema evolution
Simple partitioning for fast data access
Data compaction and retention utilities
Snapshots for reproducible results and rollback
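As a rough illustration of the capabilities listed above, the sketch below pairs each one with a Trino SQL statement of the kind the Trino Iceberg connector accepts. The table `tabular.demo.events`, its columns, the seven-day retention value, and the timestamp are all hypothetical; adjust them to your own catalog before trying anything similar.

```python
# Hypothetical catalog.schema.table used throughout this sketch.
TABLE = "tabular.demo.events"


def metadata_table(table: str, name: str) -> str:
    """Return the quoted name of an Iceberg metadata table, e.g. "events$snapshots"."""
    prefix, _, base = table.rpartition(".")
    return f'{prefix}."{base}${name}"'


CAPABILITY_SQL = {
    # Partitioning is declared once, as a transform on a column.
    "partitioning": (
        f"CREATE TABLE {TABLE} "
        "(event_id BIGINT, event_time TIMESTAMP(6), payload VARCHAR) "
        "WITH (partitioning = ARRAY['day(event_time)'])"
    ),
    # Schema evolution: add a column without rewriting existing data files.
    "schema evolution": f"ALTER TABLE {TABLE} ADD COLUMN country VARCHAR",
    # Compaction: rewrite small files into larger ones.
    "compaction": f"ALTER TABLE {TABLE} EXECUTE optimize",
    # Retention: drop snapshots older than the threshold.
    "retention": (
        f"ALTER TABLE {TABLE} EXECUTE "
        "expire_snapshots(retention_threshold => '7d')"
    ),
    # Snapshots: list the table's history, then query it as of a point in time.
    "snapshot history": (
        f"SELECT snapshot_id, committed_at FROM {metadata_table(TABLE, 'snapshots')}"
    ),
    "time travel": (
        f"SELECT * FROM {TABLE} "
        "FOR TIMESTAMP AS OF TIMESTAMP '2023-05-01 00:00:00 UTC'"
    ),
}

if __name__ == "__main__":
    for capability, sql in CAPABILITY_SQL.items():
        print(f"-- {capability}\n{sql};\n")
```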

Both Trino and Iceberg are open source projects with vibrant data engineering communities that use and contribute to them. Together, they make it possible to build a truly modern open data lake. However, we know that many data teams don’t have the resources or expertise needed to run this preferred open source software stack. Enter Starburst and Tabular.

Starburst and Tabular

Starburst was founded to solve this exact dilemma for Trino: to help more teams implement and operationalize the OSS query engine as a complete platform with capabilities such as access control, data discovery, catalog search, and data products. All of this is provided as a fully managed service in the cloud, Starburst Galaxy.

Just as Starburst was founded to help teams manage Trino, Tabular was founded by the creators of Apache Iceberg because, as Ryan Blue stated, “data engineers and data scientists exhaust far too much energy fighting the shortcomings of their data infrastructure.”

Tabular is a managed metastore catalog integrated with role-based access controls and a swarm of automated services. The beauty of Tabular is that it provides a secure layer that can be used by any compute framework. This means you can apply a consistent set of access control policies regardless of whether you access the data from Starburst Galaxy or Spark.

Get started today with Starburst Galaxy, Tabular, and Apache Iceberg

Now, with the combined power of Starburst Galaxy and Tabular, you can get the optimal experience for managing and operating Trino and Iceberg.

The easiest way to get started is through the new connector in Starburst Galaxy (see “Configure your cloud storage catalogs” for AWS S3, Microsoft Azure Data Lake Storage, and Google BigQuery). All you need to do is configure your connection to Tabular via the Galaxy UI, and you can immediately start querying your Iceberg data.

Follow along with this tutorial or watch the video the Tabular team put together for step-by-step instructions on how to get started.

Tabular, Iceberg, Galaxy: On-demand webinar for data engineering and cloud-native organizations

Join Developer Advocate Monica Miller and Iceberg co-creator Ryan Blue as they walk you through best practices for using Starburst Galaxy, Tabular, and Iceberg together.

Watch now

How to migrate your Hive tables to Apache Iceberg

Learn more

About Tabular

Tabular is a secure table store that unifies data warehouses and data lakes. You can use one copy of your data and one set of access controls everywhere. Tabular supports Snowflake alongside data lake engines like Spark, Flink, and Trino. Tabular is now an integrated catalog choice in Starburst Galaxy. You can now combine the power of Apache Iceberg and Trino with nothing more than a web browser and an email address.

