Last Updated: 2023-12-20
Starburst Galaxy's built-in attribute-based access control (ABAC) feature allows business domain owners, platform administrators, and data engineers to apply fine-grained access controls to various data entities. This is done by creating policies around the tags applied to those data entities. These controls are combined with roles and privileges, allowing organizations to enact precise, reusable policies around specific data entities.
The following diagram illustrates this architecture.
You need a Starburst Galaxy account to complete this tutorial. Please see Starburst Galaxy: Getting started for instructions on setting up a free account.
Upon successful completion of this tutorial, you will be able to:
Starburst tutorials are designed to get you up and running quickly by providing bite-sized, hands-on educational resources. Each tutorial explores a single feature or topic through a series of guided, step-by-step instructions.
As you navigate through the tutorial you should follow along using your own Starburst Galaxy account. This will help consolidate the learning process by mixing theory and practice.
The Information Security (InfoSec) team at Chryse Corp. requires all departments to hide personally identifiable information (pii) from unauthorized users.
Fortunately for the InfoSec team, Chryse Corp. uses Starburst Galaxy, so this is easy to implement. All they need to do is create a role and policy that denies access to data entities with a pii tag. Once the role is in place, it can be inherited by other roles across the organization via Starburst Galaxy's role-based access control (RBAC) features, denying pii access to unauthorized individuals.
In this tutorial, you'll help Chryse Corp. by tagging data entities with a pii tag, and then creating a role with a policy that denies pii access to unauthorized users. You will also use role inheritance to deny pii access.
Attribute-based access control works by tagging data to create a group, then applying a policy to specific tags. These controls are especially useful for domain experts and line-of-business owners who work closely with the data associated with their domain.
Policies work by matching expressions to tags, and allowing or denying privileges based on matches. Starburst Galaxy allows two hierarchical levels of tags. A policy is applied to the top-level tag, x, with the matching expression has_tag(x). A policy is applied to a specific nested tag, y, with the expression has_tag(x.y), or to all nested tags with the expression has_tag(x.*). To learn more about policies, see our technical documentation page.
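The matching expressions described above can be sketched in a few lines of Python. This is only an illustrative model, not Starburst Galaxy code: the function name matches_tag_expression is hypothetical, and the sketch assumes that a top-level expression matches only the top-level tag itself, while a wildcard matches only the tags nested under it.

```python
from typing import Iterable


def matches_tag_expression(expression: str, column_tags: Iterable[str]) -> bool:
    """Return True if any tag on a data entity matches the expression.

    Assumed semantics (a simplification for illustration):
      "x"    matches only the top-level tag x
      "x.y"  matches only the nested tag x.y
      "x.*"  matches every tag nested under x, but not x itself
    """
    tags = set(column_tags)
    if expression.endswith(".*"):
        prefix = expression[:-1]  # keep the trailing dot, e.g. "pii."
        return any(tag.startswith(prefix) and len(tag) > len(prefix) for tag in tags)
    return expression in tags


print(matches_tag_expression("pii.*", {"pii.phone"}))  # True
print(matches_tag_expression("pii.*", {"pii"}))        # False
print(matches_tag_expression("pii", {"pii"}))          # True
```

Under these assumptions, a wildcard policy never reaches an entity that carries only the parent tag, which is worth keeping in mind when you decide how to tag each column.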
The following video walks through all of the steps in this tutorial. Please feel free to watch and follow along with the steps in your own account, or skip to the written instructions if you prefer.
Tags are the foundation of attribute-based access control. In this section, you will create three tags to identify personally identifiable information (pii). Your team has asked you to create tags for customer phone numbers and social security numbers.
Sign into Starburst Galaxy in the usual way. If you have not already set up an account, you can do that here.
Only the data entity owner can add metadata to data entities. In this tutorial, you'll add tags from the accountadmin role.
Now it's time to create a tag for pii data. The tags for phone number and ssn will be nested under the main pii tag. These are special types of pii data, so the tagging process will reflect this hierarchical relationship.
Now you can create the two nested tags. These will tag phone numbers and Social Security Number (SSN) data.
It's best to check that the new tags have been created successfully.
You are going to assign the tags that you just created to tables, so that you can later create policies based on those tags.
In this scenario, the table that you have been asked to tag is the customer table, which contains several types of sensitive customer information. You're going to tag each of those types based on their attributes and then restrict access to them based on policies and roles.
In this example, you will add tags to the
Now it's time to view your tags, starting with the phone tag. The Column tab on the right should be visible by default.
Now it's time to assign a tag for Social Security numbers (SSN) too.
Now it's time to assign a pii tag for the last_name column.
Again, it's best to confirm that all of these changes have been added successfully.
Now, you'll bring it all together by creating a role that uses the tags to deny access to pii data. You will also create a second role that inherits the first role to see how the privileges are inherited.
You're creating the role in this step, and in the next step you'll add the policy to deny access.
Now it's time to add a new policy to the tag so that it can be implemented.
New policies require a definition. Complete the required fields to create the new policy.
Each policy in Starburst Galaxy has a defined scope. When you create a new policy, you must outline this scope as part of the creation process.
In this tutorial, you are going to add a new privilege that denies access when selected.
Now it's time to add a new role. In this scenario, you're creating a role for the marketing department.
After the role is created, you will then add the deny_pii role to the marketing role to see how the privileges combine.
You don't want marketing to have access to pii, so you're going to add the deny_pii role to the new marketing role.
Now it's time to add privileges that allow the marketing role to select all schemas inside the lakehouse_burst_bank catalog.
Remember that some columns will be hidden for this role. This is because the marketing role has inherited privileges from the deny_pii role.
Now it's time to outline the details of the new privilege being created for the marketing role. This will outline exactly what the new privilege is allowed to do and what it is restricted from doing.
Choose the lakehouse_burst_bank catalog. This automatically ensures that the privilege applies to all schemas within the catalog.
Now it's time to check whether the new privilege is working properly.
Notice that for both phone and ssn the Select from table column is denied. If you hover over either of them, you see that this restriction is inherited from the deny_pii policy.
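The way the marketing role picks up the deny restriction from deny_pii can be modelled roughly as follows. This is a hypothetical sketch rather than the Starburst Galaxy API: the Role class and its fields are invented for illustration, the tag names pii.phone and pii.ssn are assumed from the nested tags created earlier, and the sketch assumes that a matching deny always takes precedence over an allow.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """Hypothetical model of a role; not the Starburst Galaxy API."""
    name: str
    select_catalogs: set = field(default_factory=set)  # catalogs with an allow-select privilege
    denied_tags: set = field(default_factory=set)      # tags covered by a deny policy
    granted_roles: list = field(default_factory=list)  # roles this role inherits from

    def inherited(self, attribute):
        """Union of an attribute over this role and every role granted to it."""
        values = set(getattr(self, attribute))
        for role in self.granted_roles:
            values |= role.inherited(attribute)
        return values

    def can_select(self, catalog, column_tags):
        # Assumption: a matching deny always wins over an allow.
        if set(column_tags) & self.inherited("denied_tags"):
            return False
        return catalog in self.inherited("select_catalogs")


deny_pii = Role("deny_pii", denied_tags={"pii.phone", "pii.ssn"})
marketing = Role(
    "marketing",
    select_catalogs={"lakehouse_burst_bank"},
    granted_roles=[deny_pii],
)

print(marketing.can_select("lakehouse_burst_bank", {"pii.phone"}))  # False
print(marketing.can_select("lakehouse_burst_bank", set()))          # True
```

Keeping the deny rule in its own role means any other department role can reuse it by adding deny_pii to its granted roles, without redefining the policy.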
Next, you're going to assign yourself to the role so that you can test it out later.
Your team wants to make sure that the marketing role does not have access to sensitive customer information. This will be an opportunity to test the policies that you just created to confirm that they work. You'll want to make sure that the new privileges allow the correct types of data while restricting the types you intended.
Let's get going!
First, you'll start by level-setting in the accountadmin role. Because accountadmin has broad privileges, you would expect to see all columns and all tables.
You're going to check that this is the case before proceeding.
Confirm that the aws-us-east-1-free cluster is running, and that the lakehouse_burst_bank catalog and the burst_bank schema have been selected.
The following two queries read the last_name, phone, and ssn columns from the customer table. Each of these tests the attribute-based access in a different way.
SELECT * FROM customer;
SELECT last_name, phone, ssn FROM customer;
Because accountadmin has total access to the columns in your query, you should see all three columns in your query results.
The last_name column shows results.
The phone column shows results.
The ssn column shows results.
Now it's time to test the new marketing role to see whether the correct access has been granted and restricted in the appropriate way.
To do this, you're first going to have to switch roles to marketing so you can test its access as someone using that role.
Now you're in the marketing role. It's time to test out the privileges unique to that role. To do this, you're going to start with the second of your two SQL statements.
Recall that the marketing role should be denied access to the phone and ssn columns in the customer table. You're going to re-run your query, but this time expect a different result. Specifically, expect that the phone and ssn columns cannot be returned, resulting in an error.
The query fails with an Access Denied error.
The error occurs because the SELECT statement includes columns that you are not allowed to access.
Now it's time to turn to the first query, the one that returned all results. Last time, with accountadmin enabled, you saw results from every column.
This time, you're in the marketing role, so you'd expect the restricted columns to be blocked. Specifically, you'd expect these columns to be absent from the result set, but all other columns to still be present.
Time to test your hypothesis!
Notice that there are no phone or ssn columns in the results of the first SQL statement. This is because the marketing role does not have permission to view these columns.
Notice that the results for the last_name column are still visible. This is a bit unexpected. Although it makes sense, it might not be what you were imagining for the marketing role.
The matching expression in the deny_pii policy is has_tag(pii.*), which excludes parent tags.
As a result, the policy matches pii.phone and pii.ssn but not pii.
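The two behaviors observed in this section, an error when denied columns are named explicitly and silently omitted columns for SELECT *, can be summarized in a small model. This is a sketch for intuition only, not how Starburst Galaxy is implemented: the function visible_columns, the AccessDenied exception, and the custkey column are hypothetical stand-ins.

```python
class AccessDenied(Exception):
    """Raised when a query names a column the role cannot read."""


# Hypothetical subset of the customer table; custkey is an invented extra column.
TABLE_COLUMNS = ["custkey", "last_name", "phone", "ssn"]


def visible_columns(requested, denied):
    """Return the columns a query would show for a role with the given denied columns."""
    if requested == "*":
        # SELECT * quietly drops the columns the role cannot read.
        return [column for column in TABLE_COLUMNS if column not in denied]
    blocked = [column for column in requested if column in denied]
    if blocked:
        # Naming a denied column explicitly fails the whole query.
        raise AccessDenied(f"Cannot select from columns {blocked}")
    return list(requested)


marketing_denied = {"phone", "ssn"}
print(visible_columns("*", marketing_denied))  # ['custkey', 'last_name']

try:
    visible_columns(["last_name", "phone", "ssn"], marketing_denied)
except AccessDenied as error:
    print("Access Denied:", error)
```

Note that last_name survives in both cases, because in this model it is not among the denied columns.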
The ABAC policies you set up for the marketing role worked to deny access to customer phone numbers and social security numbers as expected.
However, your team wanted the marketing role to be denied access to the customers' last names as well. Although it makes sense why this isn't the case, it's not quite what the company wanted.
Try to fix the matching expression in the deny_pii policy so that the last_name column is hidden from the marketing role.
If you get stuck, view the last step below.
Let's edit the matching expression set for the deny_pii policy to add some additional logic. This will help resolve the problem you identified.
Now it's time to make the changes to the logic governing the scope of the policy. This will make it so that the marketing role is also denied access to the last_name column.
Change the matching expression to has_tag(pii.*) OR has_tag(pii).
The policy now matches both the nested pii.* tags and the parent pii tag.
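To see why adding the parent tag to the expression changes the outcome, the comparison below evaluates both versions of the policy against the three tagged columns. It is a rough sketch under stated assumptions: Python's fnmatch glob matching stands in for Galaxy's tag matching, the tag names pii.phone and pii.ssn are assumed from the nested tags in this tutorial, and denied_columns is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Column tags as assigned in this tutorial (nested tag names are assumptions).
COLUMN_TAGS = {
    "last_name": {"pii"},
    "phone": {"pii.phone"},
    "ssn": {"pii.ssn"},
}


def denied_columns(patterns):
    """Columns whose tags match at least one of the OR'ed patterns."""
    return sorted(
        column
        for column, tags in COLUMN_TAGS.items()
        if any(fnmatchcase(tag, pattern) for tag in tags for pattern in patterns)
    )


before = denied_columns(["pii.*"])        # original expression
after = denied_columns(["pii.*", "pii"])  # expression with the parent tag added

print(before)  # ['phone', 'ssn']
print(after)   # ['last_name', 'phone', 'ssn']
```

The wildcard alone skips last_name because that column carries only the parent tag; adding the second pattern closes the gap without retagging any column.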
Congratulations! You have reached the end of this tutorial, and the end of this stage of your journey.
Now that you've completed this tutorial, you should have a better understanding of just how easy and convenient it is to use attribute-based access control in Starburst Galaxy.
At Starburst, we believe in continuous learning. This tutorial provides the foundation for further training available on this platform, and you can return to it as many times as you like. Future tutorials will make use of the concepts used here.
Starburst has lots of other tutorials to help you get up and running quickly. Each one breaks down an individual problem and guides you to a solution using a step-by-step approach to learning.
Visit the Tutorials section to view the full list of tutorials and keep moving forward on your journey!