Starburst Galaxy: Building a Resilient, Multi-Cloud Managed Trino Offering

More deployment options

When it comes to Trino, data engineers have a lot of different implementation options. Unfortunately, not all of them are equally viable in production.

Starburst was founded to solve this problem by providing the best possible Trino implementation, without compromise. 

What does “best” mean in practice? In essence, it means two things:

  • Simplicity and ease of use.
  • Enhanced features and performance that go beyond other Trino implementations.

Let’s unpack how Starburst Galaxy delivers both: simplicity that makes Trino easier to use, and platform enhancements that improve functionality for real-world workloads.

What is Starburst Galaxy?

At its core, Starburst Galaxy is a fully managed Trino service built around a multi-cloud lakehouse platform that uses Iceberg and Trino.

Galaxy is a simplifying force for your data architecture. It leverages Trino but goes beyond it, adding data ingestion and data management. It is designed to handle provisioning, scaling, patching, and securing the Trino infrastructure, so your teams can focus on analytics, not ops.

Ultimately, this is all down to its architecture. 

Starburst Galaxy architecture at a glance

Starburst Galaxy operates as an ecosystem of services distributed across different architectural planes and cloud platforms, connected using Kubernetes clusters. These services fall into three planes:

  • Control plane
  • Service plane
  • Data plane 

Let’s unpack each of these planes individually. 

Control plane

The control plane provides centralized management for identity (SSO), RBAC/ABAC policies, orgs/projects, catalogs/metadata, configuration, provisioning, and fleet orchestration across AWS, Google Cloud Platform (GCP), and Microsoft Azure.

It coordinates cluster lifecycle, routing, observability, and billing so teams don’t need to manage infrastructure themselves. Importantly, it also controls the stateless services that power the Galaxy UI and APIs, broker requests, and manage connectors and catalogs.

Service plane

At the heart of the service plane is a query service. It is responsible for parking a query until the customer’s Trino cluster has started up and is ready to service requests.

Here, regional access control servers that enforce private connectivity and encryption policies operate as part of a unified governance model. Meanwhile, the regional metadata service is responsible for caching all the metadata related to the catalogs in the region. Finally, the result-set caching service, as the name suggests, is responsible for caching results for repeat usage.

Data plane

The data plane consists of elastic Trino clusters (both coordinators and workers) running on Kubernetes in every supported cloud provider region. Running in-region provides low-latency access, workload isolation, autoscaling, and predictable performance for interactive and batch SQL.

How do these three planes fit together? 

Starburst Galaxy’s model combines a lightweight control plane with elastic data planes close to your data in each cloud region. Trino clusters run on Kubernetes (in the data plane) in each provider for low-latency access and isolation, coordinated by a control plane that handles identity, policy, configuration, and fleet operations.

The end result is shown in the diagram below. 

Image depicting Starburst Galaxy data architecture, specifically 3 planes (the control plane, service plane, and data plane), with additional reference to the ingestion plane.

Sample query flow (end-to-end)

Seeing data architecture in principle is useful, but working through an example helps explain how it works in practice.

Let’s unpack the diagram below, which shows a typical query flow through Starburst Galaxy.

Image depicting the Starburst Galaxy query flow across multiple stages from ingestion to query execution.

1) Authentication

To begin, a user or BI tool authenticates to Galaxy. From there, the control plane resolves identity and role/attribute context for RBAC and ABAC enforcement.

2) SQL Query submission and service plane assessment

Next, a SQL query is submitted as a POST request against a Galaxy Trino cluster.
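As a rough illustration of what this submission looks like on the wire, the sketch below builds (but does not send) a statement request in the style of the open-source Trino client protocol, where the SQL text is the body of a POST to `/v1/statement`. The cluster URL, user, and query are hypothetical placeholders, and real clients would normally add authentication and use an official driver rather than raw HTTP.

```python
from urllib.request import Request

# Hypothetical placeholder values, not real Galaxy endpoints or accounts.
CLUSTER_URL = "https://example-cluster.invalid"
USER = "analyst@example.com"


def build_statement_request(sql: str) -> Request:
    """Build (without sending) a POST that carries the SQL text as its body."""
    return Request(
        url=f"{CLUSTER_URL}/v1/statement",  # Trino client protocol statement endpoint
        data=sql.encode("utf-8"),           # the query text is the raw request body
        headers={"X-Trino-User": USER},     # identifies the submitting user
        method="POST",
    )


request = build_statement_request("SELECT 1")
print(request.get_method(), request.full_url)
```

Nothing is transmitted here; the point is only that the query service sees an ordinary HTTP POST that it can inspect before deciding what to do with it.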

This request is intercepted by the query service in the service plane, triggering a check for two things: 

  1. Does the user have the right permissions to access the Galaxy Trino cluster?
  2. Is the Galaxy Trino cluster up and running?

Outcome 1

If the user does not have access to the Galaxy Trino cluster, the query service responds to the client with an auth failure status code.

Outcome 2

If the user has access and the Galaxy Trino cluster is running, the request is proxied to the cluster. If the cluster is not running (for example, if it is suspended), the query service triggers a cluster start event and parks the query until the cluster is ready.
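The checks and outcomes above can be sketched as a small decision function. This is a simplified illustration, not Galaxy's actual implementation: the `ClusterState` and `Decision` enums and the `route_query` function are hypothetical names, and the parking behavior follows the service plane description earlier in this post.

```python
from enum import Enum


class ClusterState(Enum):
    RUNNING = "running"
    SUSPENDED = "suspended"


class Decision(Enum):
    REJECT_AUTH_FAILURE = "reject"   # Outcome 1: user lacks access
    PROXY_TO_CLUSTER = "proxy"       # Outcome 2: cluster is already running
    PARK_AND_START_CLUSTER = "park"  # Outcome 2: cluster must be started first


def route_query(user_has_access: bool, state: ClusterState) -> Decision:
    """Decide what the query service does with an incoming request."""
    # Check 1: permissions are evaluated before anything else.
    if not user_has_access:
        return Decision.REJECT_AUTH_FAILURE
    # Check 2: a running cluster can take the request immediately.
    if state is ClusterState.RUNNING:
        return Decision.PROXY_TO_CLUSTER
    # Otherwise trigger a start event and hold the query until the cluster is ready.
    return Decision.PARK_AND_START_CLUSTER


for access, state in [(False, ClusterState.RUNNING),
                      (True, ClusterState.RUNNING),
                      (True, ClusterState.SUSPENDED)]:
    print(access, state.value, "->", route_query(access, state).name)
```

Ordering the permission check first means an unauthorized request never causes a suspended cluster to start.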

3) Starburst Galaxy Trino cluster

Next, the query lands in the Galaxy Trino cluster, which parses the SQL and resolves referenced catalogs (i.e., data sources).

4) Query planning and execution

From there, the query is planned and dispatched for execution. To do this, connectors read from underlying object stores and data warehouses, using provider‑native private connectivity and end‑to‑end encryption, honoring policies and roles.

5) Scaling of execution 

Execution scales elastically based on workload. This means that idle clusters can auto‑suspend to control costs, then resume on demand.
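To make the auto-suspend idea concrete, here is a minimal sketch of an idle-timeout check. The five-minute threshold, the `should_suspend` function, and its parameters are assumptions for illustration only; this post does not describe how Galaxy actually decides that a cluster is idle.

```python
from datetime import datetime, timedelta

# Hypothetical idle threshold; the real value would be a cluster setting.
IDLE_TIMEOUT = timedelta(minutes=5)


def should_suspend(running_queries: int,
                   last_query_finished: datetime,
                   now: datetime) -> bool:
    """Return True when a cluster has been idle long enough to suspend."""
    # A cluster with active work is never suspended.
    if running_queries > 0:
        return False
    # Otherwise suspend once the idle period reaches the threshold.
    return now - last_query_finished >= IDLE_TIMEOUT


now = datetime(2025, 1, 1, 12, 0)
print(should_suspend(0, now - timedelta(minutes=10), now))  # idle past the threshold
print(should_suspend(2, now - timedelta(minutes=10), now))  # still running queries
```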

6) Results returned to the client

Finally, the results are returned to the client. Lineage and data quality metrics (if configured) are captured in Galaxy or surfaced to integrated tools.

Notably, throughout execution, fine‑grained governance (ABAC/RBAC), masking, and sensitive‑data tagging are enforced by the control and service planes.

Detailed query flow diagram

The diagram below illustrates the same workflow in more detail.

Image depicting a detailed query flow diagram for Starburst Galaxy, showing data movement across multiple stages.

Query events and query history

Throughout the query lifecycle, Starburst Galaxy stores query metadata detailing query events and query history. These events are replicated via Kafka, then sent to a query loader service. The loader stores part of the data in CockroachDB for 4 weeks, and that partial record is used to pull the full event data from Amazon S3. Events older than 4 weeks are stored entirely in Amazon S3.
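One way to read this retention scheme is as a two-tier lookup: recent events are found through the partial record in CockroachDB and then fetched in full from Amazon S3, while older events exist only in Amazon S3. The sketch below encodes that reading. The 4-week window comes from the description above; the `lookup_path` function and the store labels are hypothetical, and no real database or S3 calls are made.

```python
from datetime import datetime, timedelta

# Retention window for the partial records, as stated in the text.
HOT_RETENTION = timedelta(weeks=4)


def lookup_path(event_time: datetime, now: datetime) -> list[str]:
    """Return the stores consulted, in order, for an event of a given age."""
    if now - event_time <= HOT_RETENTION:
        # Recent event: the partial record points at the full event in S3.
        return ["cockroachdb", "s3"]
    # Older event: only the full copy in S3 remains.
    return ["s3"]


now = datetime(2025, 3, 1)
print(lookup_path(now - timedelta(days=3), now))
print(lookup_path(now - timedelta(days=60), now))
```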

How this process works across clouds

The Starburst Galaxy query execution process operates across clouds. Let’s unpack how it does this in more detail.

Data federation for universal access with limited data centralization  

First, Galaxy leverages data federation to provide universal access without the need for universal data centralization. In practice, this means that compute is deployed near data sources on AWS, GCP, and Azure to minimize data movement and egress (avoiding huge cross-region and/or cross-cloud costs) while maintaining performance.
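The placement idea can be illustrated with a toy ranking that prefers compute in the same region as the data, then the same cloud, and only then another cloud. The `Location` type, the ranking values, and the region names are assumptions made for this sketch and do not describe Galaxy's actual scheduling logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    cloud: str
    region: str


def placement_rank(cluster: Location, data: Location) -> int:
    """Lower is better: 0 same region, 1 same cloud, 2 cross-cloud."""
    if cluster == data:
        return 0  # no cross-region or cross-cloud transfer
    if cluster.cloud == data.cloud:
        return 1  # cross-region transfer within one provider
    return 2      # cross-cloud transfer, typically the most expensive


def nearest_cluster(clusters: list[Location], data: Location) -> Location:
    """Pick the cluster location that minimizes data movement."""
    return min(clusters, key=lambda c: placement_rank(c, data))


clusters = [
    Location("aws", "us-east-1"),
    Location("gcp", "europe-west1"),
    Location("azure", "westeurope"),
]
print(nearest_cluster(clusters, Location("gcp", "europe-west1")))
```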

Use connectors to connect to multiple data sources

Next, Galaxy connects to data sources using built-in, managed connectors. These cover both object stores and warehouses, so analysts can query data where it already lives.

Maintain continuous data governance 

All of this occurs with data governance throughout. To achieve this, Galaxy helps keep traffic private with provider-native private connectivity options and end-to-end encryption.

Governance, security, and trust

Data governance is worth looking at in more depth because it’s one of the foundational guarantees that make Galaxy different from other offerings of its kind.

Let’s look at how Galaxy delivers end-to-end data governance in full detail. 

Unified policy model

Starburst Galaxy defines roles and attributes once, then applies policies across catalogs, schemas, tables, columns, and rows. This means that enforcement happens consistently in the control/service planes and is applied in the data plane (the Trino clusters) at query time via Galaxy’s access control integration.

Classification and masking

Starburst Galaxy automatically identifies and tags sensitive data. It then applies dynamic data masking and row-level filters, so the same query returns different results based on user attributes or roles.
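The effect of masking and row-level filtering can be shown with a small in-memory sketch. Everything here is hypothetical: the sample rows, the `admin` and `pii_reader` role names, the region-based filter, and the masking format are invented for illustration and are not Galaxy's policy syntax.

```python
from dataclasses import dataclass

# Sample table; the "email" column stands in for a column tagged as sensitive.
ROWS = [
    {"region": "EU", "customer": "Ada", "email": "ada@example.com"},
    {"region": "US", "customer": "Grace", "email": "grace@example.com"},
]


@dataclass(frozen=True)
class User:
    name: str
    roles: frozenset
    region: str


def mask_email(value: str) -> str:
    """Keep the first character and the domain, hide the rest."""
    local, _, domain = value.partition("@")
    return f"{local[:1]}***@{domain}"


def run_query(user: User) -> list[dict]:
    """Return the rows of the same logical query as this user would see them."""
    visible = []
    for row in ROWS:
        # Row-level filter: non-admins only see rows for their own region.
        if "admin" not in user.roles and row["region"] != user.region:
            continue
        out = dict(row)
        # Dynamic masking: only users with the pii_reader role see clear text.
        if "pii_reader" not in user.roles:
            out["email"] = mask_email(out["email"])
        visible.append(out)
    return visible


admin = User("root", frozenset({"admin", "pii_reader"}), "US")
analyst = User("eu_analyst", frozenset({"analyst"}), "EU")
print(run_query(admin))
print(run_query(analyst))
```

Both users issue the same query; the admin sees every row in clear text, while the analyst sees only the EU row with a masked email.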

Lineage and auditability

Starburst Galaxy query activity is captured and transformed into running and completed query records and lineage, persisted centrally for audits and impact analysis.

Quality and trust signals

Starburst Galaxy lets you configure freshness checks and data quality rules. The resulting metrics and lineage are surfaced in Galaxy or your preferred tools to increase confidence and speed up issue triage.

Secure by default

Starburst Galaxy provides end‑to‑end encryption, governed data products, and provider‑native private connectivity to minimize exposure while keeping data in your control.

Data engineer and data analyst experience with Starburst Galaxy

Starburst Galaxy was built for data engineers and data analysts looking for a simpler solution that leverages Trino without the operational overhead. With this as its core mission, it provides the following experience for developers. 

Leverage SQL with the power of Trino

Familiar SQL experience for analysts and engineers, with a clean UI for queries, clusters, and catalogs.

Use an ecosystem of other tools, like dbt

Easy connection to adjoining tools like dbt, allowing you to model, transform, and schedule jobs directly in Galaxy.

Powered by Iceberg, allowing interoperability 

Support for open table formats like Apache Iceberg and Delta Lake, allowing for interoperability at the table level, and ensuring that you’re never locked into a proprietary ecosystem as your architecture evolves.

What’s next? 

And it doesn’t stop there. Starburst Galaxy is always evolving, with plenty in the works, including the launch of the Starburst MCP server and the Starburst AI Agent, which can help answer questions you might have about your data. Collectively, these features leverage AI to increase productivity at scale.

Our next few blog posts will cover several topics in-depth, including AI, data ingestion (streaming and file-based), networking, telemetry, and access control!

Get started with Starburst Galaxy 

In the meantime, if you’re interested in knowing more about Starburst Galaxy, it’s free to explore. Start a free trial today.

 
