We are thrilled to announce the release of Starburst Enterprise 429-e LTS (Long-Term Support). This release encompasses a wealth of new features and improvements, each designed to enhance the capabilities, efficiency, and user experience of our platform. Let’s dive into the significant features of this release, their use cases, and the benefits they bring to your data-driven initiatives.
PyStarburst: Bridging SQL analytics and Python
PyStarburst serves as an interface for Python applications to interact directly with Starburst and Trino’s SQL engines. It allows Python scripts and notebooks to execute SQL queries through Starburst, leveraging its distributed SQL engine. This integration provides a native Python experience while accessing data across diverse sources supported by Starburst and Trino, such as Hadoop, S3, or Kafka.
PyStarburst bridges the gap between Python’s data processing capabilities and SQL’s data querying strengths. This allows data scientists and analysts to work within the Python ecosystem while harnessing the power of Starburst’s distributed SQL query engine. By enabling direct SQL execution within Python environments, PyStarburst reduces the need to switch between different tools or languages for data querying and processing tasks. This results in more streamlined workflows and greater productivity.
Users can now combine Python’s extensive libraries, such as Pandas for data manipulation or Matplotlib for data visualization, with powerful SQL analytics. This creates opportunities for more sophisticated data analysis and visualization that leverages the best of both worlds. Data science workflows often involve a combination of data extraction, transformation, and analysis. PyStarburst simplifies these workflows by providing a unified platform for both data extraction (using SQL queries) and subsequent data processing and analysis in Python.
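As a rough illustration of that extract-in-SQL, analyze-in-Python workflow, here is a minimal sketch. It does not use PyStarburst's own API. Instead, it assumes any Python DB-API style connection and uses the standard library's in-memory `sqlite3` database as a stand-in for a Starburst cluster. The `orders` table, its data, and the helper names `fetch_rows` and `build_demo_connection` are hypothetical.

```python
import sqlite3
from statistics import mean


def fetch_rows(conn, sql, params=()):
    """Run a SQL query on a DB-API style connection and return rows as dicts."""
    cur = conn.cursor()
    cur.execute(sql, params)
    columns = [col[0] for col in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


def build_demo_connection():
    """Create an in-memory SQLite database standing in for a remote SQL engine."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("emea", 120.0), ("emea", 80.0), ("apac", 200.0), ("amer", 50.0)],
    )
    return conn


conn = build_demo_connection()

# Step 1: extraction and aggregation are expressed in SQL.
totals = fetch_rows(
    conn,
    "SELECT region, SUM(amount) AS total FROM orders "
    "GROUP BY region ORDER BY region",
)

# Step 2: further analysis happens in plain Python on the query result.
average_total = mean(row["total"] for row in totals)
above_average = [row["region"] for row in totals if row["total"] > average_total]
```

In a real deployment the connection would point at a Starburst cluster rather than SQLite, and the second step could just as easily hand the rows to Pandas or Matplotlib.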
Read more about using PyStarburst in the documentation.
Unity Catalog: Metastore enhancement
This release adds Unity Catalog support, as a read-only public preview, as a metastore for Delta Lake. Acting as a metastore, Unity Catalog works like a grand library of data, offering a unified view across diverse data sources. This support is particularly useful for organizations seeking to simplify their data governance and enhance data discovery and management. The preview applies specifically to managed tables in Delta Lake, accessed through the Delta Lake connector.
Read more in the Starburst Delta Lake connector documentation.
Credential vending for AWS Lake Formation: Secure, seamless access
Navigating the waters of data security and access can be daunting. With the new credential vending feature for AWS Lake Formation, we’re handing out secure keys to access data lakes. This feature is a beacon of security and efficiency, ensuring that data remains both protected and accessible to authorized personnel. This update is crucial for organizations that use AWS Lake Formation, as it simplifies access management while ensuring data security.
Credential vending is a mechanism that dynamically provides temporary, limited-privilege credentials for accessing AWS Lake Formation resources. This approach eliminates the need for hard-coded credentials, reducing the risk of credential exposure and enhancing overall security. When a user or application requests access to a data resource, Starburst Enterprise interacts with AWS Lake Formation to obtain temporary credentials specifically for that access request. The credentials provided are tightly scoped in terms of permissions and duration, aligning with the principle of least privilege.
Credential vending allows for dynamic and conditional access control based on user roles, contexts, and specific access requirements. This ensures that data access is granted based on current needs and policies, enhancing the overall security posture.
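The idea of short-lived, tightly scoped credentials can be sketched with a toy model. This is not the AWS Lake Formation or Starburst Enterprise API. The `POLICY` table, the `TemporaryCredentials` class, the `vend_credentials` function, and the 15-minute lifetime are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
import secrets


@dataclass(frozen=True)
class TemporaryCredentials:
    """A toy credential scoped to one resource, a set of actions, and a deadline."""
    token: str
    resource: str
    permissions: frozenset
    expires_at: datetime

    def allows(self, resource, permission, now):
        # Access requires the right resource, a granted action, and an unexpired token.
        return (
            resource == self.resource
            and permission in self.permissions
            and now < self.expires_at
        )


# Hypothetical policy: (user, resource) -> actions that user may perform.
POLICY = {
    ("analyst", "sales.orders"): frozenset({"SELECT"}),
}


def vend_credentials(user, resource, now, ttl=timedelta(minutes=15)):
    """Issue credentials for a single access request, or refuse if no policy matches."""
    permissions = POLICY.get((user, resource))
    if not permissions:
        raise PermissionError(f"{user} has no access to {resource}")
    return TemporaryCredentials(
        token=secrets.token_hex(16),  # random per-request token, never hard-coded
        resource=resource,
        permissions=permissions,
        expires_at=now + ttl,
    )


issued_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
creds = vend_credentials("analyst", "sales.orders", now=issued_at)
```

Because each credential carries its own scope and expiry, a leaked token in this model is only useful for one resource, one set of actions, and a short window of time.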
Read more about AWS Lake Formation credential vending in the documentation.
Delta Lake enhancements: Flexible data management
Inevitably, structures and schemas evolve. The support for CREATE OR REPLACE TABLE statements in Delta Lake offers more flexibility in managing and evolving data schemas. This feature is beneficial for those who need to adapt their data structures quickly to meet changing requirements.
The CREATE OR REPLACE TABLE statement in Delta Lake replaces an existing table's definition in a single statement, without a separate drop-and-recreate step. This simplifies schema evolution and data structure changes, and is particularly beneficial in environments where data models and schemas are updated frequently due to evolving business requirements or data sources. Changes to table structures, such as adding or modifying columns, can be implemented more efficiently and with less risk of data loss or downtime.
This capability is also critical for maintaining the integrity of versioned data in Delta Lake, where table modifications need to be tracked and managed effectively.
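For a sense of what such a statement looks like, the following sketch builds one from a list of columns. The helper `create_or_replace_table_sql`, the table name `delta.sales.orders`, and the column list are hypothetical. The helper does no identifier quoting or validation, so treat it as an illustration rather than production code.

```python
def create_or_replace_table_sql(table, columns):
    """Build a CREATE OR REPLACE TABLE statement from (name, type) pairs.

    Illustration only: identifiers are inserted as-is, without quoting.
    """
    if not columns:
        raise ValueError("at least one column is required")
    column_defs = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return f"CREATE OR REPLACE TABLE {table} ({column_defs})"


# Hypothetical table whose schema has gained a new "discount" column.
statement = create_or_replace_table_sql(
    "delta.sales.orders",
    [("order_id", "bigint"), ("region", "varchar"), ("discount", "double")],
)
print(statement)
```

The resulting text can be submitted through any SQL client connected to a catalog that uses the Delta Lake connector.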
MongoDB connector improvements: Enhanced query performance
Improvements to the MongoDB connector, including predicate pushdown support, make queries more efficient. This update is particularly valuable for those working with large MongoDB databases: predicate pushdown can significantly improve query performance and reduce data transfer between MongoDB and Starburst, leading to cost savings.
Predicate pushdown is a query optimization technique where Starburst Enterprise offloads the filtering part of a query (the predicates) to the data source (MongoDB). Instead of pulling all data from MongoDB and then filtering it within Starburst, the query engine sends the filtering logic to MongoDB. MongoDB then applies this logic directly and only returns the relevant subset of data. This process can significantly reduce the amount of data transferred over the network from MongoDB to Starburst, leading to more efficient use of network resources and faster query execution.
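The effect on data transfer can be seen in a toy model. This sketch does not use MongoDB or the Starburst connector. `ToySource`, its `rows_transferred` counter, and the sample data are hypothetical stand-ins for a data source that can optionally apply a filter before returning rows.

```python
class ToySource:
    """A stand-in data source that counts how many rows leave it."""

    def __init__(self, rows):
        self._rows = rows
        self.rows_transferred = 0

    def scan(self, predicate=None):
        """Return rows; when a predicate is given, filter before 'transfer'."""
        result = [r for r in self._rows if predicate is None or predicate(r)]
        self.rows_transferred += len(result)
        return result


def is_active(row):
    return row["status"] == "active"


# 1,000 rows, of which every tenth one is active.
rows = [
    {"id": i, "status": "active" if i % 10 == 0 else "archived"}
    for i in range(1000)
]

# Without pushdown: the engine pulls every row, then filters locally.
source = ToySource(rows)
without_pushdown = [r for r in source.scan() if is_active(r)]
transferred_without = source.rows_transferred

# With pushdown: the source applies the filter and returns only matches.
source = ToySource(rows)
with_pushdown = source.scan(predicate=is_active)
transferred_with = source.rows_transferred
```

Both approaches produce the same result set, but in this model the pushdown path moves a tenth of the rows across the boundary between source and engine.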
Farewell to the Snowflake distributed connector: Towards a parallel future
The deprecation of the Snowflake distributed connector, in favor of the improved Snowflake parallel connector, marks a transition towards more efficient and scalable data integration. Users of the Snowflake distributed connector will need to migrate to the parallel connector, which offers enhanced performance and scalability.
Keeping up-to-date with modern Java
As part of keeping our product development efficient and our software secure, we routinely audit our code dependencies and libraries for necessary updates. The Starburst Enterprise 429-e LTS release is planned to be the last quarterly LTS that uses Java 17, with the upcoming February LTS expected to require Java 21. This is advance notice of an upcoming change; no action is needed at this time.
Our Starburst documentation team has been hard at work improving the experience of reading about our products. We recently consolidated our Starburst Enterprise content to make it easier to navigate and discover. One example is the new administration topics section, which includes useful guides for platform administrators who want to follow best practices when deploying and configuring their cluster.
We have more enhancements to come over the next two quarters, so keep an eye on the Starburst documentation for future improvements.
The Starburst Enterprise 429-e LTS release marks a significant step forward in our journey to provide the most advanced and user-friendly data analytics platform. These features not only enhance the technical capabilities of Starburst Enterprise but also provide tangible business benefits, from improved security and governance to increased efficiency and performance. Embrace the power of Starburst Enterprise 429-e LTS to transform your data analytics and engineering endeavors into a competitive advantage.
Read more about all of the features and enhancements in the 429-e release notes.