Last Updated: 2024-04-16

Background

AWS PrivateLink provides private connectivity between virtual private clouds (VPCs), supported AWS services, and on-premises networks without exposing traffic to the public internet.

Starburst Galaxy supports AWS PrivateLink for some of its catalogs. In this tutorial, you will learn how to set up PrivateLink for Amazon Redshift.

Scope of tutorial

In this tutorial, you will learn how to configure AWS PrivateLink for Amazon Redshift.

Learning objectives

Once you've completed this tutorial, you will be able to:

Prerequisites

About Starburst tutorials

Starburst tutorials are designed to get you up and running quickly by providing bite-sized, hands-on educational resources. Each tutorial explores a single feature or topic through a series of guided, step-by-step instructions.

As you navigate through the tutorial you should follow along using your own Starburst Galaxy account. This will help consolidate the learning process by mixing theory and practice.

Background

If you are configuring PrivateLink for the first time you are encouraged to work with a Starburst technical resource. This individual will work with you to set up the environment needed to complete the tutorial.

Contacting your technical resource

To be assigned this resource, you should reach out to your regular Starburst account team for assistance.

Working together

Once assigned, your Starburst technical resource will work with you to set up an environment where you can complete the tutorial.

Please review the following overview of this process before beginning the tutorial.

Your responsibilities:

Background

Understanding the Redshift PrivateLink architecture is important when completing the steps in this tutorial. In this section you will learn about this architecture and the way that Starburst Galaxy uses it to securely connect private clouds.

This tutorial also follows corresponding AWS documentation on the topic. It is recommended that you consult this documentation if you want to learn more about AWS PrivateLink in general.

Reference architecture

The following diagram illustrates a PrivateLink connection between the Starburst Galaxy VPC and the Amazon Redshift VPC.

Review the diagram and corresponding notes below for more information.

  1. Once the PrivateLink configuration is complete, an endpoint is created in the Starburst Galaxy VPC (Source).

    This endpoint connects to a Network Load Balancer located inside an endpoint service situated in the Redshift cluster VPC (Destination).

    This establishes a private connection between Starburst Galaxy and the Redshift cluster, enabling PrivateLink functionality.
  2. In this reference architecture, the Starburst Galaxy VPC is the source.
  3. In this reference architecture, the Redshift VPC is the destination.

Background

Amazon Redshift can be deployed in two ways: Redshift Serverless or Redshift Provisioned Clusters.

In each case, the process of obtaining the necessary information for PrivateLink differs slightly. Despite this, the way the information is used remains very similar.

Option 1: Redshift Serverless

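If you are using Redshift Serverless, you will need to obtain the Redshift endpoint, the subnet IDs of your workgroup, the IP addresses of the endpoint, and the corresponding availability zones.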

Option 2: Redshift Provisioned Clusters

If you are using Redshift Provisioned Clusters, you will need to obtain the cluster endpoint, the cluster's availability zone, and the IP address of the endpoint.

Select appropriate deployment option

To continue with this tutorial, select the option that matches your Redshift deployment.

Background

Redshift Serverless deployment is one of the two types of Redshift deployment.

In this implementation, users do not need to provision or manage the underlying infrastructure. Instead, AWS itself handles the infrastructure management. This includes autoscaling of the cluster based on workload demands.

Step 1: Sign in to AWS console

You're going to start by signing in to your AWS console.

Remember that this should be the AWS account containing the Redshift warehouse that you would like to connect using PrivateLink, so if you use multiple AWS accounts, ensure that you pick the correct one.

Step 2: Select Redshift Serverless

Next, you're going to enter the Redshift Serverless section of AWS. You can get there from the left-hand navigation bar.

Step 3: Select Namespace

Step 4: Select Workgroup

Now that you're on the Namespace page, it's time to select your workgroup.

Step 5: Copy your Redshift endpoint

Now it's time to gather some information from the AWS console. Later, you'll use this to connect your Redshift warehouse to Starburst Galaxy using PrivateLink.

The first pieces of information you need to gather are the endpoint and subnet IDs for your Redshift warehouse. Let's start by copying the Redshift endpoint, located in the General information section.
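
The endpoint shown in the console usually combines a hostname, a port, and a database name. As a minimal sketch, assuming the `host:port/database` form, the following Python splits the copied value into its parts so that the hostname can be reused in later steps. The `parse_endpoint` helper and the example hostname are hypothetical and exist only for illustration.

```python
from typing import NamedTuple, Optional


class RedshiftEndpoint(NamedTuple):
    host: str
    port: int
    database: Optional[str]


def parse_endpoint(endpoint: str, default_port: int = 5439) -> RedshiftEndpoint:
    """Split 'host[:port][/database]' into its parts."""
    endpoint = endpoint.strip()
    # Everything after the first '/' is treated as the database name.
    host_port, _, database = endpoint.partition("/")
    host, _, port = host_port.partition(":")
    if not host:
        raise ValueError(f"no hostname found in {endpoint!r}")
    return RedshiftEndpoint(host, int(port) if port else default_port, database or None)


# Hypothetical endpoint value, for illustration only.
example = parse_endpoint(
    "example-workgroup.123456789012.us-east-1.redshift-serverless.amazonaws.com:5439/dev"
)
print(example.host)
print(example.port)
```

The port falls back to 5439, the default Redshift port, when the copied value does not include one.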

Step 6: Copy the subnets from your Workgroup

Now it's time to copy the subnet IDs from your Workgroup.

These are located in the Data access tab in the Network and security section.

Step 7: Obtain the IP addresses of your Redshift endpoint

Now that you have your Redshift endpoint recorded, you can use it to find its IP addresses. For Redshift Serverless, you will have at least two IP addresses.

You'll be using a terminal window to find the IP addresses. Again, you will be copying information into your text editor.
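
If you prefer a script to an interactive lookup, the same DNS resolution can be sketched in Python with the standard `socket` module. This is an illustration rather than the exact command used in the tutorial: `resolve_ipv4` is a hypothetical helper, and the loopback address stands in for the hostname part of your own endpoint.

```python
import socket
from typing import List


def resolve_ipv4(hostname: str, port: int = 5439) -> List[str]:
    """Return the sorted, de-duplicated IPv4 addresses for a hostname."""
    results = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each result is (family, type, proto, canonname, (address, port)).
    return sorted({sockaddr[0] for *_, sockaddr in results})


# Replace the loopback address with the hostname part of your Redshift endpoint.
print(resolve_ipv4("127.0.0.1"))
```

Run the lookup from a machine that resolves the endpoint the same way your VPC does, and record every address returned.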

Step 8: Navigate to Subnets

Now that you've obtained the IP addresses of your Redshift endpoint, you can use them to determine the availability zones. You will need this information when you create a load balancer later in this tutorial.

To do this, you're going to start by navigating to the Subnets menu, which is accessed through the VPC dashboard.

Step 9: Determine the availability zones

Now it's time to use your Redshift subnets to determine the corresponding Availability Zones.
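
The matching itself is a containment check: each IP address belongs to the subnet whose CIDR block contains it, and that subnet's Availability Zone is the one to record. The sketch below shows the idea with Python's `ipaddress` module. The subnet IDs, CIDR blocks, zones, and IP addresses are made-up values; substitute the ones shown for your own subnets.

```python
import ipaddress
from typing import Dict, List, Tuple

# Hypothetical subnet records: (subnet ID, CIDR block, Availability Zone).
SUBNETS: List[Tuple[str, str, str]] = [
    ("subnet-aaaa1111", "10.0.1.0/24", "us-east-1a"),
    ("subnet-bbbb2222", "10.0.2.0/24", "us-east-1b"),
]


def zones_for_ips(ips: List[str], subnets: List[Tuple[str, str, str]]) -> Dict[str, str]:
    """Map each IP address to the Availability Zone of the subnet containing it."""
    zones: Dict[str, str] = {}
    for ip in ips:
        address = ipaddress.ip_address(ip)
        for _subnet_id, cidr, zone in subnets:
            if address in ipaddress.ip_network(cidr):
                zones[ip] = zone
                break
        else:
            # No subnet matched, so the recorded data is probably incomplete.
            raise LookupError(f"{ip} is not inside any of the listed subnets")
    return zones


print(zones_for_ips(["10.0.1.25", "10.0.2.40"], SUBNETS))
```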

Background

Redshift provisioned clusters enable users to create fully managed data warehousing environments with customizable configurations using the AWS cloud platform.

With provisioned clusters, users have full control over cluster configuration and are responsible for managing the infrastructure, including scaling the cluster up or down based on workload demands.

Step 1: Sign in to AWS console

You're going to start by signing in to your AWS console.

Remember that this should be the AWS account containing the Redshift warehouse that you would like to connect using PrivateLink, so if you use multiple AWS accounts, ensure that you pick the correct one.

Step 2: Select Provisioned clusters dashboard

Next, you're going to enter the Redshift provisioned clusters dashboard section of AWS. This can be accessed via the left-hand navigation bar.

Step 3: Select Clusters menu

Now it's time to view specific provisioned clusters. AWS includes a specific section for this in the left-hand navigation menu.

Step 4: Select your cluster

Next, it's time to select your cluster from the Clusters menu. If you have multiple clusters, make sure to select the correct one.

Step 5: Record cluster endpoint

Now it's time to record your cluster's endpoint. Later, this will be used to connect the Redshift cluster to Starburst Galaxy.

Step 6: Record cluster availability zone

Now it's time to copy your cluster's availability zone and paste it into a text editor. To do this, you're going to access the Network and security settings section located under the Properties tab.

Step 7: Obtain the IP address of your Redshift endpoint

Now that you have your Redshift endpoint recorded, you can use it to find its IP address.

You'll be using a terminal window to do so. Again, you will be copying information into your text editor.

Background

Now it's time to set up a target group. A target group is responsible for routing incoming traffic from a load balancer to registered targets, which are typically instances, containers, or IP addresses.

The target group you create in this tutorial will route traffic to the IP address of your Redshift endpoint.

Step 1: Start the target group wizard

The AWS console includes a target group creation wizard. This allows you to quickly and easily create target groups. It is accessed using the EC2 dashboard.

Step 2: Provide a target group name

Now it's time to configure your new target group. AWS will ask you to select a target type and provide a meaningful name.

Step 3: Configure target group

Next, you're going to configure your target group for use with your Redshift cluster. To do this, you're going to use some of the details that you copied into your text editor earlier in this tutorial.
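
This tutorial uses the console wizard, but it can help to see the same settings written out as data. The sketch below builds argument dictionaries in the shape accepted by the boto3 `elbv2` calls `create_target_group` and `register_targets`. It only constructs the dictionaries and makes no AWS call. The TCP protocol and port 5439 (the default Redshift port) are assumptions, and the helper names are hypothetical.

```python
from typing import Any, Dict, List


def target_group_request(name: str, vpc_id: str, port: int = 5439) -> Dict[str, Any]:
    """Arguments for an IP-type TCP target group (assumed settings)."""
    return {
        "Name": name,
        "Protocol": "TCP",
        "Port": port,
        "VpcId": vpc_id,
        "TargetType": "ip",  # targets are the Redshift endpoint IP addresses
    }


def register_targets_request(
    target_group_arn: str, ips: List[str], port: int = 5439
) -> Dict[str, Any]:
    """Arguments that register each Redshift IP address as a target."""
    return {
        "TargetGroupArn": target_group_arn,
        "Targets": [{"Id": ip, "Port": port} for ip in ips],
    }


# Placeholder identifiers, for illustration only.
print(target_group_request("redshift-privatelink-tg", "vpc-0123456789abcdef0"))
```

Use the port your database actually listens on if it differs from the default.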

Step 4: Complete configuration process

For the final step, you're going to finish the configuration process and create the target group.

Background

Now it's time to create a network load balancer. In AWS, a Network Load Balancer (NLB) is a service that automatically distributes incoming network traffic across multiple targets based on IP protocol data. These targets include Amazon EC2 instances, containers, and IP addresses. Load balancers can also be configured across either a single Availability Zone or multiple Availability Zones.

After configuring PrivateLink, an endpoint in the Starburst Galaxy VPC will connect to your Network Load Balancer using a service located in the Redshift cluster VPC.

Step 1: Start the load balancer wizard

Just like target groups, AWS includes a load balancer wizard to help make the creation of load balancers easy. Again, this is located in the EC2 dashboard.

Step 2: Select load balancer type

AWS load balancers come in several different types. These include Application Load Balancers, Network Load Balancers, and Gateway Load Balancers.

For this tutorial, you're going to select the Network Load Balancer.

Step 3: Name your load balancer

It's time to start configuring your new load balancer, starting with a name.

Step 4: Configure the load balancer

Next, you're going to configure your load balancer for use with your Redshift cluster.

Step 5: Select the availability zone and subnet(s)

Now it's time to select the AWS availability zones (AZs) for your load balancer. These will be the same AZs that you recorded for your Redshift deployment earlier in this tutorial.

Step 6: Configure security group

Step 7: Configure port and target group

Now it's time to set the listener port and select the target group that you created earlier.
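
For reference, the load balancer and its listener can be described with two small argument dictionaries, in the shape accepted by the boto3 `elbv2` calls `create_load_balancer` and `create_listener`. This is a sketch under stated assumptions rather than the exact configuration used in the tutorial: the `internal` scheme, the TCP protocol, and port 5439 are assumptions, and every name and ARN is a placeholder. No AWS call is made.

```python
from typing import Any, Dict, List


def load_balancer_request(name: str, subnet_ids: List[str]) -> Dict[str, Any]:
    """Arguments for a Network Load Balancer in the recorded subnets."""
    return {
        "Name": name,
        "Type": "network",
        "Scheme": "internal",  # assumption: traffic arrives only over PrivateLink
        "Subnets": list(subnet_ids),
    }


def listener_request(
    load_balancer_arn: str, target_group_arn: str, port: int = 5439
) -> Dict[str, Any]:
    """Arguments for a TCP listener that forwards to the target group."""
    return {
        "LoadBalancerArn": load_balancer_arn,
        "Protocol": "TCP",
        "Port": port,
        "DefaultActions": [{"Type": "forward", "TargetGroupArn": target_group_arn}],
    }


# Placeholder values, for illustration only.
print(load_balancer_request("redshift-privatelink-nlb", ["subnet-aaaa1111", "subnet-bbbb2222"]))
```

The listener port should match the port that the target group forwards to.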

Step 8: Wait for load balancer to activate

That's it! Your load balancer is now being created. This process takes between three and five minutes.

Background

Now it's time to create an endpoint service.

In the context of AWS PrivateLink, an endpoint service allows you to expose services running in your VPC to other accounts within the same AWS region using a private connection.

Step 1: Start the endpoint service wizard

Like target groups and load balancers, AWS allows you to create an endpoint service using a wizard.

Step 2: Provide an endpoint service name

Begin by naming your endpoint service and choosing the load balancer type.

Step 3: Configure endpoint service

Now it's time to configure your endpoint service. You're going to make sure that it connects with your network load balancer and uses the correct IP address.
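
In scripted form, an endpoint service is essentially a pointer to one or more Network Load Balancers plus an acceptance setting. The sketch below builds arguments in the shape accepted by the boto3 `ec2` call `create_vpc_endpoint_service_configuration`, without calling AWS. Requiring acceptance is an assumption, chosen because a later step in this tutorial accepts the connection request manually. The helper name and the ARN are hypothetical.

```python
from typing import Any, Dict, List


def endpoint_service_request(
    nlb_arns: List[str], acceptance_required: bool = True
) -> Dict[str, Any]:
    """Arguments for an endpoint service fronted by the given load balancers."""
    if not nlb_arns:
        raise ValueError("at least one Network Load Balancer ARN is required")
    return {
        "NetworkLoadBalancerArns": list(nlb_arns),
        "AcceptanceRequired": acceptance_required,
    }


# Placeholder ARN, for illustration only.
print(endpoint_service_request(["arn:aws:elasticloadbalancing:placeholder"]))
```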

Background

Time to switch gears. You've completed all of the steps that you carry out on your own. Now it's time to contact the Starburst support team to finish the last steps.

Step 1: Enter the Starburst Galaxy ARN

In the last section of this tutorial, you created your endpoint service. At the end of that process, you were directed to a page that displays the details of that service.

You're going to use this page to enter the Starburst Galaxy Amazon Resource Name (ARN).
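
A mistyped ARN is easy to miss, so a quick format check before saving can help. The sketch below validates that a value looks like an IAM principal ARN and builds arguments in the shape accepted by the boto3 `ec2` call `modify_vpc_endpoint_service_permissions`. The account number shown is a placeholder, not the Starburst Galaxy ARN; use the value that Starburst provides. The pattern and the helper name are illustrative assumptions.

```python
import re
from typing import Any, Dict

# Loose pattern for IAM principal ARNs: an account root, a user, or a role.
_PRINCIPAL_ARN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:(root|user/.+|role/.+)$")


def allow_principal_request(service_id: str, principal_arn: str) -> Dict[str, Any]:
    """Arguments that allow one principal to connect to the endpoint service."""
    if not _PRINCIPAL_ARN.match(principal_arn):
        raise ValueError(f"{principal_arn!r} does not look like an IAM principal ARN")
    return {"ServiceId": service_id, "AddAllowedPrincipals": [principal_arn]}


# Placeholder service ID and ARN, for illustration only.
print(allow_principal_request("vpce-svc-0123456789abcdef0", "arn:aws:iam::123456789012:root"))
```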

Step 2: Record Service name

Next, you will locate and copy the service name for your endpoint service. The Starburst support team will use this information to create the endpoint in Starburst Galaxy.
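
Endpoint service names generally follow the pattern `com.amazonaws.vpce.<region>.vpce-svc-<id>`. As a minimal sketch, the following check confirms that the copied value has that shape and extracts the region and service ID before you pass it on. The `split_service_name` helper and the example value are hypothetical.

```python
import re
from typing import Tuple

_SERVICE_NAME = re.compile(r"^com\.amazonaws\.vpce\.([a-z0-9-]+)\.(vpce-svc-[0-9a-f]+)$")


def split_service_name(service_name: str) -> Tuple[str, str]:
    """Return (region, service ID) for an endpoint service name."""
    match = _SERVICE_NAME.match(service_name.strip())
    if match is None:
        raise ValueError(f"{service_name!r} is not a valid endpoint service name")
    return match.group(1), match.group(2)


# Made-up service name, for illustration only.
region, service_id = split_service_name(
    "com.amazonaws.vpce.us-east-1.vpce-svc-0123456789abcdef0"
)
print(region, service_id)
```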

Step 3: Open support ticket

You are going to use the automated assistant in Starburst Galaxy to open a support ticket and provide support with the Service name that you just copied. You will also need to provide the port your database is listening on and your preferred Starburst Galaxy PrivateLink configuration name.

Step 4: Select the Starburst Galaxy endpoint

Do not begin this step until you receive confirmation from Starburst support that the Starburst Galaxy endpoint has been created successfully.

Step 5: Accept the endpoint connection request

Now that you've selected the Starburst Galaxy endpoint, it's time to accept the connection request.

Step 6: Confirm endpoint connection

That's it. The connection is now being created. This process takes between one and three minutes to complete.

When this process is complete, you are finished and ready to start using PrivateLink.

Tutorial complete

Congratulations! You have reached the end of this tutorial, and the end of this stage of your journey.

You're all set! Now you can use PrivateLink to configure access to data in your Redshift deployment.

Continuous learning

At Starburst, we believe in continuous learning. This tutorial provides the foundation for further training available on this platform, and you can return to it as many times as you like. Future tutorials will make use of the concepts used here.

Next steps

Starburst has lots of other tutorials to help you get up and running quickly. Each one breaks down an individual problem and guides you to a solution using a step-by-step approach to learning.

Tutorials available

Visit the Tutorials section to view the full list of tutorials and keep moving forward on your journey!
