Introducing Enhancements to Starburst Galaxy’s Autoscaler

Starburst Galaxy's autoscaler now considers several new metrics for proactive and cost-effective autoscaling

Published: May 14, 2024

Since its launch, Starburst Galaxy has provided customers with the automatic scaling of compute resources, ensuring optimal performance for varying workloads.

Today we are launching several enhancements to cluster autoscaling in Starburst Galaxy, available in private preview. The autoscaling process now factors in the following metrics:

  • CPU load
  • Estimated runtime
  • Queue length
  • Insights from completed queries

These improvements aim to address a wider range of workloads and provision resources more efficiently, resulting in faster query execution times without increasing costs. In this blog post, we take a closer look at these enhancements and review the results of a real-world customer test comparing the legacy and enhanced autoscalers.

A proactive approach to autoscaling 

Previously, if CPU utilization reached or exceeded 60%, the cluster would scale up to its maximum number of allowed workers. However, because automatic resource scaling was triggered solely based on CPU utilization, workloads constrained by other factors didn’t receive additional resources as quickly as needed, if at all.
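As a rough illustration of the legacy rule described above, the following minimal sketch encodes a CPU-only trigger. The function name, its parameters, and the choice to leave the cluster size unchanged below the threshold are assumptions made for illustration; this is not Galaxy's actual implementation.

```python
LEGACY_CPU_THRESHOLD = 0.60  # the 60% threshold quoted in this post


def legacy_target_workers(cpu_utilization: float,
                          current_workers: int,
                          max_workers: int) -> int:
    """Hypothetical sketch of a CPU-only scaling rule.

    Jumps straight to max_workers once CPU utilization reaches the
    threshold; otherwise keeps the current size (scale-down is out of
    scope for this sketch).
    """
    if cpu_utilization >= LEGACY_CPU_THRESHOLD:
        return max_workers
    return current_workers


# A workload constrained by something other than CPU never triggers a scale-up.
print(legacy_target_workers(0.40, current_workers=2, max_workers=10))  # 2
print(legacy_target_workers(0.60, current_workers=2, max_workers=10))  # 10
```

Under a rule of this shape, a cluster sitting at 40% CPU stays at its current size no matter how many queries are waiting.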

The image below shows the new proactive autoscaling behavior in Galaxy. In the top graph, CPU usage peaks at around 40%. At that peak, the bottom graph shows additional workers being added even though CPU consumption never reaches the legacy 60% threshold.

Smarter resource allocation

When workloads require additional capacity but don’t receive it due to constraints on resources other than CPU or delays in activating the additional resources, customers encounter slow query response times or failures. To mitigate these issues, a common practice is to overcommit compute resources, leading to increased costs. 

With the improved autoscaler, Galaxy estimates computation time based on a broader set of metrics, enabling it to serve a wider range of workloads effectively. Additionally, the autoscaling decision is now made earlier and faster, typically within two minutes, compared with at least four minutes previously for large queries. These enhancements are designed to deliver faster query execution and reduce the need to adjust resources manually.
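To make the idea of a multi-metric decision concrete, here is a minimal sketch in which any one of several signals can request more workers. This post does not describe Galaxy's actual decision logic, so everything here is an assumption: the field names, the simple "any signal" rule, and every threshold other than the 60% CPU figure are hypothetical placeholders. Insights from completed queries are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSignals:
    """Point-in-time cluster metrics; field names are hypothetical."""
    cpu_utilization: float        # fraction of CPU in use, 0.0 to 1.0
    queued_queries: int           # queries waiting for resources
    estimated_runtime_min: float  # estimated runtime of in-flight work, in minutes


def wants_more_workers(signals: ClusterSignals,
                       cpu_threshold: float = 0.60,
                       queue_threshold: int = 5,
                       runtime_threshold_min: float = 10.0) -> bool:
    """Return True if any single signal suggests the cluster is under-provisioned.

    All thresholds except the 60% CPU figure are made-up placeholders.
    """
    return (
        signals.cpu_utilization >= cpu_threshold
        or signals.queued_queries >= queue_threshold
        or signals.estimated_runtime_min >= runtime_threshold_min
    )


# CPU sits at 40%, below the legacy threshold, but a long queue still asks for workers.
busy_queue = ClusterSignals(cpu_utilization=0.40, queued_queries=12,
                            estimated_runtime_min=3.0)
print(wants_more_workers(busy_queue))  # True
```

The point of the sketch is only that a rule combining several signals can react while CPU is still well under 60%, which a CPU-only rule cannot.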

The results are in

In a recent customer test, we compared Starburst Galaxy’s enhanced autoscaler against the legacy autoscaler across various workload sizes. With schema sf10000, query execution time fell from 5.86 to 4.52 minutes, a reduction of about 23%. With the larger sf100000 schema, the improvement was even more pronounced: execution time dropped from 24.35 to 13.80 minutes, a reduction of about 43%.
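The relative improvements follow directly from the timings reported above. The short sketch below computes the percentage reduction and the speedup factor for each schema; the function and variable names are illustrative only, and the timings are the ones quoted in this post.

```python
def percent_reduction(before_min: float, after_min: float) -> float:
    """Percentage by which the runtime shrank, relative to the original runtime."""
    return (before_min - after_min) / before_min * 100.0


# (legacy minutes, enhanced minutes) as reported in the customer test
results = {
    "sf10000": (5.86, 4.52),
    "sf100000": (24.35, 13.80),
}

for schema, (legacy, enhanced) in results.items():
    reduction = percent_reduction(legacy, enhanced)
    speedup = legacy / enhanced
    print(f"{schema}: {reduction:.1f}% shorter, {speedup:.2f}x speedup")

# sf10000: 22.9% shorter, 1.30x speedup
# sf100000: 43.3% shorter, 1.76x speedup
```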

Getting started with autoscaling in Galaxy is as simple as creating your free account today. 

