The Starburst Enterprise 393-e LTS release provides Starburst customers with exciting new capabilities alongside more advanced connectivity, improved performance, and enhanced security. As always, this major release combines features that have been contributed back to the open source Trino project with features curated for Starburst Enterprise customers.
This major quarterly release includes the latest batch of features for our customers. There’s never been a better time for existing customers to upgrade their clusters, and for new prospects to start their journey with Starburst.
To experience this latest release firsthand, please visit our download site.
Data lakes enable a wide range of solutions, including raw data collection, flexible data access for users, and fast, efficient data warehouses and lakehouses. From a data and analytics perspective, a data lake can act as a staging ground that transforms raw data into a format suited to analysis and reporting, or it can operate as something closer to a data warehouse with a built-in query engine.
Open table formats like Apache Iceberg and Delta Lake allow users to interact with the data lake using SQL as easily as they would with a database. Coupled with Hive, these open table formats allow more analytics to be served out of the lake and reduce the need for data movement and migration, which provides substantial cost savings.
Starburst’s power to provide advanced analytics on a data lake has been extended by adding support for native data management and data manipulation language on Iceberg. These improvements provide enhanced support for SQL functions, enabling true data warehousing analytics on a data lake.
Version 2 of the Iceberg specification supports many useful new capabilities typically associated with ‘data warehouse analytics’, including full CRUD (create, read, update, delete) operations. This release enables ‘merge on read’, which lets users apply CRUD operations at a more granular, row level.
Furthermore, this release now supports ‘time travel’ queries. A time travel query reads the contents of a table as the data existed at an arbitrary point in time. This functionality can be crucial for disaster recovery, responding to regulatory audit requests, and providing test environments for developers.
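To make the idea concrete, here is a minimal sketch in Python that builds a time travel query string. It assumes the `FOR TIMESTAMP AS OF` clause offered by the Trino Iceberg connector; the helper `time_travel_query` and the table name `iceberg.sales.orders` are hypothetical and exist only for illustration.

```python
from datetime import datetime, timezone


def time_travel_query(table: str, as_of: datetime) -> str:
    """Build a SELECT that reads `table` as it existed at `as_of`.

    Assumes the Iceberg connector's FOR TIMESTAMP AS OF clause.
    """
    if as_of.tzinfo is None:
        raise ValueError("as_of must be timezone-aware")
    # Normalize to UTC so the timestamp literal is unambiguous.
    stamp = as_of.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    return f"SELECT * FROM {table} FOR TIMESTAMP AS OF TIMESTAMP '{stamp} UTC'"


# Hypothetical table name, for illustration only.
query = time_travel_query(
    "iceberg.sales.orders",
    datetime(2022, 8, 1, 12, 0, 0, tzinfo=timezone.utc),
)
print(query)
```

The resulting string could then be submitted through any SQL client connected to the cluster, for example to compare a table before and after a suspect data load.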
Delta Lake also enables data modification and optimizations in data lakes. This release adds MERGE support, along with the ability to add comments, rename tables, and add columns. We continue to improve the performance and optimizations of our Delta Lake connector.
Our engineering team is constantly improving our engine performance. These improvements range from connector and client performance to across-the-board platform improvements and individual optimizations to SQL functions and JOINs.
We’re proud that query response times in 393-e LTS show an overall performance improvement of 8% to 13% since our last LTS (380-e). That means faster queries, lower compute costs, and faster time to insight for customers.
This also marks the first LTS release that requires Java 17, which lays the foundation for better performance, efficiency, and an improved codebase for our engineers to work with. This upgrade allows us to build new features that Java 11 would not be able to support, so we can consider the latest and greatest technologies when designing new features.
Starburst continues to invest in staying ahead of the curve so that we can efficiently deliver the products and services that best support our customers. Upgrading to Java 17 lays the foundation for us to build into the future.
For existing customers, we have a step-by-step migration guide for handling this breaking change.
Moving along to security. Starburst and our offerings adhere to strict security standards. While we care deeply about fast access to data, we believe more importantly in secure access to the right data. The confidentiality, integrity, and availability of data are fundamental to Starburst operations. By integrating security into the development process, we establish a single point of secure access to enterprise data.
Built-in access control was announced as generally available in our last LTS, and we’ve continued to add more mature functionality. In this release, built-in access control adds support for explicit deny policies. A typical use case might be allowing access to ‘almost all’ tables in a schema while explicitly denying access to a few. Complementing the DENY role privilege is the administrative functionality to select “All Roles”. This will help with more wide-ranging security updates and consistency in user access.
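The ‘almost all’ use case can be sketched in a few lines of Python. This is a toy model and not how Starburst built-in access control is implemented: the function `is_allowed`, the pattern syntax, and the `hr` schema are all hypothetical, and the sketch assumes that an explicit deny always overrides an allow.

```python
from fnmatch import fnmatchcase
from typing import List


def is_allowed(table: str, allow: List[str], deny: List[str]) -> bool:
    """Return True if `table` matches an allow pattern and no deny pattern.

    Assumption of this sketch: an explicit deny always wins over an allow.
    """
    if any(fnmatchcase(table, pattern) for pattern in deny):
        return False
    return any(fnmatchcase(table, pattern) for pattern in allow)


# Hypothetical role: every table in the `hr` schema except `hr.salaries`.
allow_rules = ["hr.*"]
deny_rules = ["hr.salaries"]

print(is_allowed("hr.employees", allow_rules, deny_rules))    # True
print(is_allowed("hr.salaries", allow_rules, deny_rules))     # False
print(is_allowed("finance.ledger", allow_rules, deny_rules))  # False
```

Without a deny rule, the same policy would need one allow entry per permitted table, and every new table would require a policy update.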
Built-in access control requires a first-class user experience. This release improves entity selection and the ability to select a single role or all roles, in addition to specific table functions. Another crucial capability for our native security is tracking and providing relevant data for audits. The audit log is now generally available in this release and provides basic filtering, a larger row limit, and the ability to download the log.
Experience / Fault-tolerant execution (FTE)
We recently announced the public preview of fault-tolerant query execution in both Starburst Galaxy and Starburst Enterprise. These capabilities enable lakehouse use cases that include building large rollup tables, preparing datasets for machine learning models, and wrangling data that feeds into data applications.
Trino is able to achieve incredibly fast speeds by prioritizing in-memory execution. Data engineers’ time is valuable, so we focus on letting data engineers write business logic at the speed of thought. Because the same standard SQL dialect serves both interactive and ETL analytics, engineers no longer need to learn different SQL dialects depending on the size of the job.
Data engineers can iteratively test SQL queries as they develop complex data pipelines because the coordinator is always up and waiting for a query.
Starburst customers now have a fast and easy-to-use solution for both interactive and longer-running data pipeline queries. Fault-tolerant execution runs data pipeline queries more intelligently: you can reliably run much larger queries, save costs by running non-latency-sensitive queries on much smaller clusters, and execute more queries concurrently.
Currently supported are task and query retries to ensure query completion, an exchange manager that stores intermediate query data so that work can resume if a node fails, and dynamic filtering. Guaranteeing query completion means our customers can have confidence using Starburst to take on resource-intensive workloads from long-running queries.
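The interplay between task retries and stored intermediate results can be illustrated with a small simulation. This is a conceptual sketch only, not the engine's implementation: `run_with_task_retry`, the `exchange` dictionary standing in for the exchange manager, and the two toy tasks are all hypothetical.

```python
from typing import Callable, Dict


def run_with_task_retry(
    tasks: Dict[str, Callable[[], int]],
    max_attempts: int = 3,
) -> Dict[str, int]:
    """Run each task in order, retrying only the ones that fail.

    `exchange` stands in for the exchange manager: finished task output is
    kept there, so a failure in one task never forces finished tasks to rerun.
    """
    exchange: Dict[str, int] = {}
    for name, task in tasks.items():
        for attempt in range(1, max_attempts + 1):
            try:
                exchange[name] = task()
                break
            except RuntimeError:
                # Give up only after the last permitted attempt.
                if attempt == max_attempts:
                    raise
    return exchange


calls = {"scan": 0, "aggregate": 0}


def scan() -> int:
    calls["scan"] += 1
    return 100


def aggregate() -> int:
    calls["aggregate"] += 1
    if calls["aggregate"] == 1:
        raise RuntimeError("simulated worker failure")
    return 42


results = run_with_task_retry({"scan": scan, "aggregate": aggregate})
print(results)  # {'scan': 100, 'aggregate': 42}
print(calls)    # {'scan': 1, 'aggregate': 2}
```

The call counts show the point of task-level retry: the failed `aggregate` task runs twice, while the already finished `scan` task runs only once. A query-level retry would instead restart everything.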
Often overlooked in our LTS releases are the incremental capabilities added to our existing connectors. Let’s be clear: the majority of Starburst connectivity is far more than merely reading data. The enhancements to connectors mean more support for SQL functions that improve performance and enable data transformations for more advanced analytics.
This release improves INSERT performance in MySQL, Oracle, PostgreSQL, and Amazon Redshift. It also includes functionality for full SQL passthrough in over 15 connectors. SQL passthrough gives users the flexibility to push an entire query down to the source, which can improve query performance.
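As a rough sketch of what a passthrough query can look like, the Python helper below wraps a source-native statement in a table function call. It assumes the `query` table function that Trino exposes as `<catalog>.system.query` in supporting connectors; the helper `passthrough_query` and the catalog name `postgres_sales` are hypothetical.

```python
def passthrough_query(catalog: str, native_sql: str) -> str:
    """Wrap `native_sql` so the whole statement runs on the source system.

    Assumes the catalog exposes the `system.query` table function.
    """
    # Double any single quotes so the native SQL survives as a string literal.
    escaped = native_sql.replace("'", "''")
    return f"SELECT * FROM TABLE({catalog}.system.query(query => '{escaped}'))"


# Hypothetical catalog and table names, for illustration only.
query = passthrough_query(
    "postgres_sales",
    "SELECT id FROM orders WHERE status = 'open'",
)
print(query)
```

Because the inner statement is executed by the source as written, it can use source-specific syntax that the engine itself would not parse.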
Previously, our Snowflake connector required one catalog per database. This limited usage and flexibility for customers with larger Snowflake deployments. With this release, our Snowflake connector supports multiple databases within Snowflake, so users can leverage Starburst and Snowflake together more easily and more completely.
Predicate pushdown, which pushes the filtering parts of a query down to the source, is now supported in Teradata and Druid. Our Google BigQuery connector now supports external tables, and basic authentication has been added to Prometheus.
Clients / ecosystem
Metabase excels at providing BI insights to users without needing to write a single line of SQL, making the tool a perfect fit for the Trino query engine that connects you to all of your data, wherever it lives.
The Starburst partner driver for Metabase easily connects Metabase Cloud, On-Prem, or Open Source to your cluster. This lets you take advantage of Metabase’s data exploration tools with the scalability and performance of a world-class query engine. Metabase can be used as a BI tool and client with Starburst Enterprise, Starburst Galaxy, and Trino.
Other notable improvements and enhancements to client integrations are with ThoughtSpot, Python, and dbt.
Our ThoughtSpot integration now supports OAuth, the industry-standard protocol for authentication. Customers no longer need to create a Service Account for ThoughtSpot, which makes the integration more accessible. The integration also works with Starburst Galaxy.
Next up, our Python client received an upgrade with various functionality developers have come to expect. These enhancements enable a better overall developer experience and more functionality for use in dbt, Airflow, and other ELT/ETL platforms.
This release also includes a new integration with dbt, a popular ETL/ELT tool. Now, data teams can use SQL to create production-ready data pipelines in dbt, which execute via Starburst for improved performance. This integration is available in both Trino and Starburst, allowing open source users to test drive the capabilities.
In summary, this release made strides to support customers pursuing a lakehouse analytics strategy, with optionality on Iceberg and Delta Lake. We’ve also improved overall performance, native security, and connectivity to data sources and clients.
And these are just the highlights! The full release notes detailing all of the features can be viewed here.
If you’re interested in hearing more, please register for our Analytics Anywhere: An Introduction to Starburst Enterprise webinar on September 22nd.