The choice is yours: Open source Trino and Starburst Galaxy

By: Mike Marolda
August 9, 2022

A few months back, when Starburst Galaxy launched on AWS, Google Cloud, and Azure, I wrote a blog post on What Fully-Managed Means to Starburst. I talked about the burdens that a fully-managed product like Starburst Galaxy eases compared with a self-managed product like open source Trino. Since we launched the platform, it has grown to offload even more of the management burden and to let you do more analytics faster with the best-in-class MPP SQL engine, Trino. So what are the differences today between Starburst Galaxy and open source Trino?

Open source Trino is a highly parallel, distributed, ANSI SQL-compliant query engine. The largest organizations in the world use Trino to query data lakes and massive data warehouses at petabyte scale with best-in-class performance. Trino’s connector-based architecture allows users to query and join distributed data across a wide variety of data sources without data movement. It supports an extensive array of use cases, including interactive analytics and long-running ETL jobs.
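
To make the cross-source idea concrete, here is a minimal Python sketch that assembles a federated query. Trino addresses every table as catalog.schema.table, so a single statement can join tables that live in different catalogs. The catalog, schema, table, and column names used here (lake, crm, orders, customers, and so on) are hypothetical and chosen only for illustration, and the helper functions are not part of any Trino client.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Build a fully qualified Trino table name: catalog.schema.table."""
    # Double-quote each part and double any embedded quote so unusual names stay valid.
    parts = (catalog, schema, table)
    return ".".join('"' + part.replace('"', '""') + '"' for part in parts)


def federated_join_sql() -> str:
    """Return one SQL statement that joins tables living in two different catalogs."""
    # Hypothetical catalogs: "lake" (object storage) and "crm" (a relational database).
    orders = qualified("lake", "sales", "orders")
    customers = qualified("crm", "public", "customers")
    return (
        "SELECT c.name, sum(o.total) AS revenue\n"
        f"FROM {orders} AS o\n"
        f"JOIN {customers} AS c ON o.customer_id = c.id\n"
        "GROUP BY c.name"
    )


if __name__ == "__main__":
    print(federated_join_sql())
```

The resulting statement could be pasted into any SQL client connected to a cluster where both catalogs are configured; no data is copied between the two systems ahead of time.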

Starburst Galaxy is an easy-to-use, fully-managed, enterprise-ready SaaS platform from the original creators of Trino. It automates many of the manual, and sometimes cumbersome, tasks associated with self-managing Trino, and it includes security, performance, and connectivity enhancements that make the query engine ready for production environments. Support is provided by the largest team of Trino experts in the world.

Ease of Use

Trino is relatively easy to set up; however, getting to a production-ready deployment can be difficult, especially at growing organizations. There is a ramp-up period to understand how to deploy and use the project. New users have to install Trino on their own servers, on-premises or in the cloud, and knowledge of Kubernetes is recommended. While deployment can be time-consuming, there are a variety of deployment types designed to fit your data ecosystem. Queries are run via a CLI or the analytics tool of your choice.
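
As a rough illustration of the CLI route, the Python sketch below assembles the argument list for a one-off query. It assumes the Trino CLI is installed on the PATH under the name trino; the server URL, catalog, schema, and table are hypothetical placeholders, and the helper function is not part of Trino itself.

```python
import shlex


def trino_cli_command(server: str, catalog: str, schema: str, sql: str) -> list[str]:
    """Assemble an argument list for running one query with the Trino CLI."""
    return [
        "trino",               # assumes the CLI executable is installed as "trino"
        "--server", server,    # coordinator URL
        "--catalog", catalog,  # default catalog for unqualified table names
        "--schema", schema,    # default schema for unqualified table names
        "--execute", sql,      # run this statement and exit instead of opening a prompt
    ]


if __name__ == "__main__":
    argv = trino_cli_command(
        "https://trino.example.com:8443",  # hypothetical coordinator address
        "lake",
        "sales",
        "SELECT count(*) FROM orders",
    )
    # Print a shell-safe version of the command rather than executing it.
    print(shlex.join(argv))
```

Passing the same list to subprocess.run(argv, check=True) would execute the query, provided the CLI is installed and the coordinator is reachable.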

Starburst Galaxy includes all the tools you need to get started fast with the best-in-class MPP SQL query engine, Trino. With an intuitive user experience, you can get started in minutes, not weeks. In just a few clicks, users connect to a data source, create a cluster, and start querying via SQL in the built-in query editor, or they can use the analytics tool of their choice. All you need to get started is access to the underlying data sources.

[Image: Creating a new cluster in Starburst Galaxy]

Management

To use Trino, you need to install the software within your architecture of choice. Creating a new cluster requires you to dive into the configuration settings. Autoscaling requires deep technical knowledge of Trino and the underlying systems. Tuning your clusters to your organization’s preferred cost-to-performance ratio can be extremely time-consuming, and modifying a cluster means you’ll have to redeploy it with your required settings. Because every change requires a redeployment, upgrading to a new version of Trino is a similarly laborious task.

On the other hand, Starburst Galaxy provides you with all the tools you need to automatically manage your environment. We’ve simplified cluster creation to a few clicks. You can choose your desired cluster size from a range of t-shirt options (x-small, small, medium, large, x-large, xx-large), or create a custom size with autoscaling enabled (currently available in public preview). Enabling idle shutdown automatically shuts down a cluster when it is not in use, saving you both time and money. Configuring a cluster for fault tolerance to handle long-running workloads is as simple as clicking a button. With blue/green deployments, you can resize your cluster without the hassle of redeploying it or stopping running workloads. And finally, you’re always on the latest version of Trino with Starburst Galaxy, because we seamlessly update when the latest version is ready. This means no disruptions or downtime associated with software upgrades.

[Image: Starburst Galaxy diagram]

Connectivity

Trino has more than 25 connectors to data lakes, data warehouses, and relational, non-relational, streaming, and key-value stores. Using these connectors requires knowledge of the individual connector property files and management of their repos. Testing connector properties requires manually restarting the clusters and observing the logs.
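
For a sense of what a connector property file contains, the Python sketch below renders one for a PostgreSQL catalog. The property names (connector.name, connection-url, connection-user, connection-password) follow Trino's PostgreSQL connector; the host, database, user, environment variable, and catalog name are hypothetical, and the render_catalog_properties helper is an illustration rather than anything Trino provides.

```python
def render_catalog_properties(props: dict[str, str]) -> str:
    """Render connector settings as the text of a catalog properties file."""
    if "connector.name" not in props:
        raise ValueError("a catalog needs a connector.name property")
    # Put connector.name first for readability, then the rest in sorted order.
    lines = [f"connector.name={props['connector.name']}"]
    lines += [
        f"{key}={value}"
        for key, value in sorted(props.items())
        if key != "connector.name"
    ]
    return "\n".join(lines) + "\n"


# Hypothetical PostgreSQL catalog; it would typically be saved as etc/catalog/crm.properties.
CRM_CATALOG = {
    "connector.name": "postgresql",
    "connection-url": "jdbc:postgresql://db.example.com:5432/crm",
    "connection-user": "trino_reader",
    # Reference an environment variable instead of hard-coding the secret.
    "connection-password": "${ENV:CRM_DB_PASSWORD}",
}

if __name__ == "__main__":
    print(render_catalog_properties(CRM_CATALOG), end="")
```

In a self-managed deployment, a file like this has to be distributed to the cluster, and the cluster restarted, before the new catalog can be queried.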

Starburst Galaxy comes with pre-configured, highly performant connectors to the most popular data lakes and cloud data warehouses. Starburst’s catalog of fully-managed connectors is always expanding, with recent releases adding connectivity to Snowflake, Azure Synapse, and Amazon Redshift. If you plan on running a data lake or lakehouse architecture, Starburst’s Great Lakes connectivity allows you to query open table formats like Iceberg, Delta Lake, and Hive on the object store of your choice with no additional configuration. And finally, we’ve greatly simplified creating a new data catalog, so you can query a new data source with only a few clicks. You can even test a new connection with a single click, with no need to restart clusters or observe logs.

Support

With Trino’s open source community, you have access to answers to many frequently asked questions and to a network of contributors. Trino has a growing, active community that provides the documentation and online support many Trino users need to self-manage their own deployments.

The reality is that not every organization has the resources and skills to stand up a production Trino environment. Many look for domain expertise and dedicated support when using Trino as an analytics engine for their most critical workloads. At Starburst, we are the Trino experts. In fact, the creators of the Trino project are also the creators of Starburst Galaxy, which has been designed from the ground up for the Trino community. We also have a deep knowledge base, built on over 1 million combined hours supporting Trino deployments at the largest companies in the world. We’re here to make your experience with Trino successful.

Give it a try

Starburst Galaxy is the only option for a fully-managed, enterprise-ready version of Trino. Over the past few months, the platform has added new connectivity options (Amazon Redshift, Azure Synapse, and Snowflake), with more on the way. We’ve added the ability to query popular open table formats like Delta Lake, Iceberg, and Hive via Great Lakes connectivity. On the security side, we’ve improved our native role-based access control so customers can control who has access to data down to the table level. We’ve also added support for single sign-on with Okta and Azure AD (currently in private preview), and achieved SOC 2 Type 2 compliance. Finally, on the management front, we’ve enabled fault-tolerant execution on a cluster with a single click, so you can perform mission-critical data transformations on your data lakehouse without your queries failing. We’ve also added autoscaling to support any scaling organization. Both features are currently in preview.

The best part: Starburst Galaxy takes only a few minutes to get started. It is available to try for free with $500 in usage credits. Sign up today!

Mike Marolda

Senior Product Marketing Manager, Starburst
