How Unity Catalog Enforces Fine-Grained Policies on Starburst Queries

When governance lives in one place and is respected everywhere

June 29, 2026

Jack Fitzpatrick

Senior Software Engineer

Starburst

A couple of weeks ago at the Databricks Data + AI Summit, I gave a joint session with Alex Jiang, a product manager from the Databricks Unity Catalog team. The topic covered a problem that the two of us have spent a lot of time on from opposite sides. How does a query engine like Starburst honor the fine-grained access control policies defined in Unity Catalog, without re-implementing any of that policy logic in the engine?

It’s a challenging problem, sitting at the intersection of two products that many enterprises run side by side. I want to walk through how we solved it, because the answer is more interesting than a simple connector integration. It is built on an open standard, the Iceberg REST catalog API, and it is a concrete example of how Starburst has been championing the vision of an open lakehouse.

The state of the open lakehouse

As the lakehouse architecture has been widely adopted across the industry, data has moved from rigid and locked-in warehouses to open table formats, revolutionizing how data is accessed and managed. Open table formats like Apache Iceberg and Delta Lake, paired with a shared catalog, allow many engines to read and write from a single copy of data rather than each keeping its own. That is genuine interoperability at the data layer, and it is the foundation on which everything here builds.

The governance problem

A high degree of openness comes with downsides, however. Traditional governance is disrupted when no single engine has full control over an organization’s data. The development of open table formats was not paired with a shared policy language, so each engine might implement governance differently, or not at all. As a result, a catalog cannot assume that every engine will enforce its policies.

Because of this, policies stayed siloed inside individual engines. You could point five engines at the same Iceberg table and still have five different ideas of who was allowed to see which rows. That is the gap Unity Catalog set out to close by centralizing governance, ensuring that policies are respected by any engine using its Iceberg REST catalog API, and eliminating duplicated policy layers and the compliance risks they create.

Why coarse-grained policies are not enough

It’s worth taking a moment to consider why other approaches, namely coarse-grained access strategies, don’t work. The mechanism many catalogs use to enforce access policies on external engines is credential vending. When a user runs a query with an engine, the catalog checks that user’s permissions and selectively vends storage credentials for the underlying data files. That works cleanly to control table-level access per user.

Fine-grained access control is difficult

The harder question is fine-grained access control, including policies that filter out rows or mask columns based on the user accessing the data. Credential vending alone cannot support that, as credentials are coarse-grained. They grant access to files, not to a subset of rows or columns inside them. You cannot create a credential that means “you may read this table, but only the non-EU rows, and with the address column masked.”
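To make the granularity point concrete, here is a toy Python model of a vended credential. It is only a sketch: `VendedCredential`, `can_read`, and the `s3://lake/...` paths are hypothetical names invented for illustration, and real cloud credentials carry richer policies. The assumption it illustrates is the one stated above: the access decision is made per file, not per row or column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VendedCredential:
    """Toy stand-in for a storage credential scoped to a path prefix."""

    allowed_prefix: str


def can_read(credential, object_path):
    """A storage-level check sees only the object path, never rows or columns."""
    return object_path.startswith(credential.allowed_prefix)


cred = VendedCredential(allowed_prefix="s3://lake/customers/")
data_file = "s3://lake/customers/part-0.parquet"

# The decision is all-or-nothing per file: if this file holds both EU and
# non-EU rows, a credential that can open it exposes all of them.
print(can_read(cred, data_file))
print(can_read(cred, "s3://lake/orders/part-0.parquet"))
```

Nothing in the check has access to the file's contents, which is why a row filter or column mask has to be applied somewhere other than the credential.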

Additionally, not every engine can be trusted to enforce a catalog’s policies. Policy languages differ widely across engines and catalogs, and there’s no guarantee that a policy defined in one system will be applied correctly by another engine. Until a shared policy model exists, catalogs need a hook into a query’s read path to redact the data a user sees.

This brings us to centralized enforcement. To ensure policies are applied properly, Unity Catalog must only surface data that has already been sanitized. Fortunately, a recent update to the Iceberg REST catalog API offers a solution.

Supporting server-side scan planning

Scan planning is the process of reading Iceberg metadata files and using that information to determine which data files need to be read to satisfy a query. Until now, this has been done locally on the client, which has direct access to an Iceberg table’s raw files.

The Iceberg REST catalog API added an endpoint to perform scan planning on the catalog server, removing the need for the client to read Iceberg metadata files. More importantly, the plan returned by the catalog server can reference any data files it chooses, not necessarily the raw data itself.

POST /v1/{prefix}/namespaces/{ns}/tables/{table}/plan
{
  "select": ["name", "address"],
  "filter": {
    "type": "eq",
    "term": "region",
    "value": "EU"
  }
}

{
  "plan-id": "<guid>",
  "status": "completed",
  "file-scan-tasks": [ ... ],
  "storage-credentials": [ ... ]
}

Using this endpoint, a client can request a set of columns to select and filters to apply, and it then receives a list of file scan tasks to execute, along with vended credentials.
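As a rough illustration of that exchange, the Python sketch below builds the request body and pulls data file paths out of a completed response. The function names `build_plan_request` and `data_file_paths` are hypothetical, the sample paths are made up, and the nested `data-file` / `file-path` keys inside each file scan task are an assumption about the task layout rather than something shown in the example above. No HTTP call is made.

```python
import json


def build_plan_request(columns, term, value):
    """Build a scan-planning request body: a projection plus one `eq` filter."""
    return {
        "select": list(columns),
        "filter": {"type": "eq", "term": term, "value": value},
    }


def data_file_paths(response):
    """Return the data file paths referenced by a completed plan.

    Assumption: each file scan task carries a `data-file` object with a
    `file-path` field. Raises ValueError if planning has not completed.
    """
    status = response.get("status")
    if status != "completed":
        raise ValueError(f"plan not ready: {status!r}")
    tasks = response.get("file-scan-tasks", [])
    return [task["data-file"]["file-path"] for task in tasks]


# Hypothetical response; the paths are invented for illustration.
sample_response = {
    "plan-id": "<guid>",
    "status": "completed",
    "file-scan-tasks": [
        {"data-file": {"file-path": "s3://bucket/tmp/part-0.parquet"}},
        {"data-file": {"file-path": "s3://bucket/tmp/part-1.parquet"}},
    ],
    "storage-credentials": [],
}

request_body = build_plan_request(["name", "address"], "region", "EU")
print(json.dumps(request_body, indent=2))
print(data_file_paths(sample_response))
```

The client never inspects table metadata here. It states what it wants to read and is told which files to open.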

This results in several benefits over client-side scan planning:

The catalog can control which data is surfaced, enabling more granular access controls
Planning is offloaded to the catalog server, which often has more context than the client
Vended credentials keep permissions tightly scoped to the user running the query
The result is execution-ready, as it follows the Iceberg specification

Fine-grained access control with server-side scan planning

Using this endpoint gives Unity Catalog a hook into the read path of the query. It can identify the user (through Starburst’s existing authentication model), determine what data that user is authorized to see, and redact it as needed. Unity Catalog will generate temporary sanitized data files and return file scan tasks pointing to them. Starburst will then read those sanitized files directly.

From the moment Starburst reads data from the server-side scan planning response, the data is already sanitized. No additional logic is required to enforce Unity Catalog’s policies. Unity Catalog can centralize its governance, and Starburst can be trusted to respect those policies, regardless of how they’re implemented by Databricks.

This trust will extend to any other REST catalog that implements this endpoint as well.

What actually happens when Starburst runs the query

Here is the end-to-end flow, which ships in Starburst Enterprise Platform 481-e STS.

Image depicting how Starburst operates when it runs a SQL query alongside Databricks using Unity Catalog.

On the Starburst side

As our engine begins planning the stages and splits of a SQL query, we use the Iceberg client to begin scan planning. Instead of reading metadata files locally and creating our own scan tasks to farm out to workers, we invoke Unity Catalog’s ScanAPI endpoint. We farm the returned file scan tasks out to our workers and execute them in exactly the same way as locally planned tasks.
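The point that workers treat catalog-planned tasks like locally planned ones can be sketched in a few lines of Python. This is not Starburst's scheduler: `assign_tasks` is a hypothetical helper, and round-robin assignment is a deliberately simple stand-in for real split scheduling.

```python
from itertools import cycle


def assign_tasks(file_scan_tasks, workers):
    """Round-robin file scan tasks across workers.

    The assignment does not care whether the tasks were planned locally
    or returned by the catalog; either way it receives a list of tasks.
    """
    if not workers:
        raise ValueError("at least one worker is required")
    assignment = {worker: [] for worker in workers}
    # zip stops when the task list is exhausted, so cycle() is safe here.
    for worker, task in zip(cycle(workers), file_scan_tasks):
        assignment[worker].append(task)
    return assignment


tasks_from_catalog = ["task-0", "task-1", "task-2"]
assignment = assign_tasks(tasks_from_catalog, ["worker-a", "worker-b"])
print(assignment)
```

Because the distribution step takes only a list of tasks, swapping the source of that list from local planning to the catalog's response leaves the rest of execution unchanged.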

On the Unity Catalog side

On the other side of that call, Unity Catalog examines the table metadata and the policies that apply to the specific user behind the query, and generates temporary data files that already have row filters and column masks applied. It returns a scan plan whose file-scan tasks point to those sanitized temporary files, along with the credentials needed to read them.
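The catalog-side behavior can be modeled with a small in-memory Python sketch. Everything here is hypothetical: `POLICIES`, `plan_scan`, the `tmp://` paths, and the `data-file` / `file-path` task layout are illustrative assumptions, not Unity Catalog's implementation. The policy mirrors the earlier example of hiding EU rows and masking the address column.

```python
# Hypothetical per-user policies: a row predicate and a set of masked columns.
POLICIES = {
    "analyst": {
        "row_filter": lambda row: row["region"] != "EU",
        "masked_columns": {"address"},
    },
}

RAW_ROWS = [
    {"name": "Ada", "region": "EU", "address": "1 Rue A"},
    {"name": "Bo", "region": "US", "address": "2 Main St"},
]

TEMP_FILES = {}  # stands in for temporary object storage


def plan_scan(user, table_rows):
    """Apply the user's policies, store a sanitized copy, and return a plan."""
    policy = POLICIES.get(user)
    if policy is None:
        raise PermissionError(f"no access for user {user!r}")
    sanitized = []
    for row in table_rows:
        if not policy["row_filter"](row):
            continue  # row filter: drop rows the user may not see
        clean = dict(row)  # copy so the raw data is never modified
        for column in policy["masked_columns"]:
            clean[column] = "****"  # column mask
        sanitized.append(clean)
    path = f"tmp://sanitized/{user}/part-0"
    TEMP_FILES[path] = sanitized
    # The plan points at the sanitized copy, not at the raw data.
    return {
        "status": "completed",
        "file-scan-tasks": [{"data-file": {"file-path": path}}],
    }


plan = plan_scan("analyst", RAW_ROWS)
path = plan["file-scan-tasks"][0]["data-file"]["file-path"]
print(TEMP_FILES[path])
```

The engine that follows this plan only ever opens the file at `path`, so it has no opportunity to see the EU row or the unmasked address.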

How it comes together

The data is already sanitized by the time it reaches Starburst. There is no masking or filtering happening inside the Starburst query engine. We are not interpreting Unity Catalog’s policy language or re-applying its rules. We are reading data that the catalog has already filtered based on the user.

Why this matters, from a real customer

This is not a hypothetical. One Starburst customer in travel and hospitality runs exactly the multi-vendor data lakehouse that this is built for. They are modernizing their stack, which includes Starburst as a query engine, Okta for identity, and both Unity Catalog and AWS Glue as Iceberg catalogs. They also run Apache Ranger as an extra governance layer on top of Starburst, precisely because their Unity Catalog policies were not being honored by Starburst queries.

Before server-side scan planning, their options were limited:

They could duplicate every Unity Catalog policy in Apache Ranger…
- … resulting in policy drift, auditing gaps, and inconsistency across engines!
They could only use Databricks…
- … giving up federation across all of Starburst’s connectors!
They could just use coarse-grained access controls…
- … and deal with overly restrictive access controls or compliance risk!

With server-side scan planning, the policy they already defined in Unity Catalog applies to their Starburst queries as well. Unity Catalog stays the single source of governance for that data. There is a unified audit trail coming from Unity Catalog, as no other policies are being applied. The redundant Ranger layer is no longer necessary for Starburst queries against Unity-governed tables, allowing them to delete a whole layer of duplicated policy rather than maintain it.

Seeing the workflow in action

In the Databricks session, the live demo was the clearest way to make this functionality concrete. We started with a test table full of the usual personally identifiable information.

Image depicting Starburst and Databricks working together using Unity Catalog.

Querying it from Starburst with client-side scan planning returned every row and every column in the clear, as we’d expect from a table with no policies.

Image depicting the Starburst side of working with Starburst and Databricks.

Then, over in Databricks, we masked the Social Security number column and added a row filter to exclude anyone in Washington or California.

Third image of Databricks and Starburst working alongside each other using Unity Catalog.

Without server-side scan planning, the data would appear unmasked and unfiltered on the Starburst side. However, using server-side scan planning, the results have the Social Security numbers masked and the filtered rows gone!

Image depicting the fourth step in using Starburst and Databricks together using Unity Catalog.

This works for any masks, filters, or governed tags applied to your data, and the results are consistent with what Databricks returns.

The power of interoperability

The key takeaway here is the value that interoperability of compute can bring to real, production scenarios. It would be easy to frame two query engines that overlap as rivals in this competitive market. But the thing we keep coming back to, and the reason this collaboration was so successful, is that enterprises do not run just one tool. They run many, choosing the right tool for the job deliberately. The best engine for a given workload is not the same across an entire organization, and the freedom to choose is worth protecting.

What makes that freedom possible are open standards. Server-side scan planning works because the Iceberg REST catalog API is an open specification that any catalog can implement, not a private handshake between two vendors. Unity Catalog enforcing policy at the catalog layer, and Starburst federating across the whole estate while honoring that policy, is the open data lakehouse working as it was intended to. Define your governance once. Query your data with the engine that fits the job. Trust that the rules hold either way.

That is the same idea behind everything Starburst has built around Unity Catalog interoperability, and it is consistent with what we showed at last year’s summit, where the theme was Choice. Choice is only meaningful if it is governed. Server-side scan planning is how the governance follows you across engines.

Starburst is built for interoperable compute

Server-side scan planning for Iceberg REST catalogs is now available in Starburst Enterprise Platform 481-e STS. If you want the deeper technical detail, the 481 documentation covers configuration, and our write-up on Starburst’s Unity Catalog integration is the right place to start if you are setting this up. For the broader picture of why we lean so hard on the Iceberg REST catalog and query federation as the basis for an open data lakehouse, those explainers go further than I can here.

If you are at a point where Unity Catalog governs your data and you want to query it across multiple engines without giving up your policies, this is for you. Define the policy once, and let it follow your data wherever the query runs.
