Showing 70 results
In our last post, we discussed two methods for running geospatial analysis with Trino and the Hive connector and explored a few optimization techniques...
The Trino open source distributed query engine is known as a choice for running ad-hoc analysis where there’s no need to model the data and...
More than any other industry, Financial Services is likely to only partially realize the elusive utopian state of 'the single source of truth' for...
We are eleven days into the new year, and I have spent the past two weeks exerting unreasonable amounts of effort trying to make...
Over the past few weeks, we’ve shared a few examples of what it means to be a data rebel. Hopefully you’ve recognized yourself in...
Most organizations have data and continue to generate and collect it on a daily basis, but have a far more difficult time in getting...
The shift to cloud-based software-as-a-service platforms is accelerating in just about every tech industry. So it wasn’t much of a surprise to the analytics...
As we’ve gone from Data Mesh theory to practice, organizations have been shifting their focus towards the central tenet of Data Mesh — building...
In this series, we demonstrate how to build data pipelines using dbt and Trino with data directly from your operational systems. They can use...
Last week in San Francisco was one for the Trino history books. After three years of planning, rescheduling, planning, and rescheduling some more, Starburst...
A data lakehouse combines the principles of a data lake and a data warehouse to include the best of both worlds. Data lakehouses are...
I have been in and around data since my days with Microsoft Access, Excel, and SQL Server circa 2000, and was fortunate to witness...
It’s finally here! We are closing in on the final countdown to Trino Summit 2022, and I can feel myself getting more excited with...
Since my first introduction to dbt, I was intrigued to say the least. Working as a data engineer, I was attempting to manage complicated...
Since Datanova: The Data Mesh Summit and our in-person executive discussions on data products and Data Mesh, we’ve been validating the data product approach...
Starburst has played a key role in the Trino community for a long time now. We contribute to the success of Trino every day....
AWS S3 has become one of the most widely used storage platforms in the world. Companies store a variety of data on S3 from...
Data virtualization revolutionized the data infrastructure space by serving data consumers directly on top of data stores, without the need to move data elsewhere....
Since Datanova: The Data Mesh Summit and our in-person executive discussions on data products and Data Mesh, we’ve been validating the data product approach...
Data indexing radically accelerates query run time and concurrency without the need for massive compute resources. But before expecting indexing to solve all your...
Since Datanova: The Data Mesh Summit and our in-person executive discussions on data products and Data Mesh, we’ve been validating the data product approach...
It is quite popular in today's data climate for modern data architectures to have some sort of batch processing system to move data into...
Customers who want a single, super fast and easy-to-use solution for both interactive and longer-running data pipeline queries now have a solution: take advantage...
Mission 2 Wrap and Mission 3 Launch We all know at least one pandemic puzzler, a devoted crossworder, or a religious wordler who finds...
Before I joined Starburst, I worked in the AdTech industry where companies buy and sell user data for online targeting advertisement campaigns or ML/AI-based...
Current State of ETL/ELT Extract-transform-load, more commonly known by its street name “ETL”, has been around since the early days of computing. Bringing together...
Recently, I had the pleasure of chatting with Ravit Jain on his show “The Ravit Show” to discuss the evolution of Trino and where...
This is Part 2 of a 2-part blog about how Trino can support both interactive and batch use cases. In Part 1, we explored...
This is Part 1 of a 2-part blog about how Trino can support both interactive and batch use cases. In Part 1, we will...
Calling all data pros! Are you ready for a $20k payday? Yes, you heard it right – you could be walking away with $20,000...
Performance enhancements are a key engineering responsibility at Starburst. One is to reduce the amount of time that a CPU has to work...
Data Fabric and Data Mesh continue to sustain legions of hype and debate. Data and analytics leaders are longing for a new roadmap, beyond...
Nod with me if you’ve suffered from the following problems with processing and analyzing Big Data via a centralized approach: different query languages, niche...
So far, we’ve highlighted a few reasons why you should attend Datanova: The Data Mesh Summit: The Woz and Justin Borgman. The next reason...
Summary Use the right tool for the right job. Not doing so means the difference between your Tableau viz rendering in seconds vs. minutes...
Over the past twenty or so years, companies have experienced a Cambrian explosion of where their customer data resides. Cloud and on-premises enterprise applications aim...
As companies shift their analytical ecosystems from on-premise to cloud and try to avoid “data lock-in”, we’re noticing some very interesting data patterns. This...
Over the past few years the “modern data stack” has entered the vernacular of the data world, describing a standardized, cloud-based data and analytics...
I’m one of those strange people who has always enjoyed doing performance testing. The thought of spinning up lots of machines to do my...
Data Mesh is based on four central concepts, the second of which is data as a product. In this blog, we’ll explore what that...
Insane in the domain! Insane in the brain! Crazy insane, got no domain! - Cypress Hill, sort of Data Mesh is based on four...
Today’s digital world is an expanding frontier of emerging technologies. There are endless innovations, inspired by data, informed by data, enabled by data, and...
By leveraging Starburst, Assurance was able to improve conversion rates, reduce costs, and enable robust modeling. Read the full case study here. ...
My fascination with SQL query performance started quite some time ago and I contributed a paper on efficient processing of data warehousing during my...
As companies shift their analytical ecosystems from on-premise to cloud and try to avoid “data lock-in”, we’re noticing some very interesting data patterns. This...
Kafka was created at LinkedIn and open sourced into the Apache Software Foundation in early 2011. It was developed to optimize writes especially for...
The media and telecommunications provider now known as Comcast began as a regional operator with just five channels and 12,000 customers. Today, Comcast has...
Most companies want to follow good security practices. With the number of security breaches coming out daily, it almost feels like a matter of...
This is the fourth episode in our video series, Starburst Elements, focused around anything and everything Starburst. In this episode, our Product Manager Vishal...
This is the third episode in our video series, Starburst Elements, focused around anything and everything Starburst. In this episode, our Product Manager Vishal...
Note: I start this piece with some technical background that has nothing to do with the data mesh, and is only relevant to data...
TL;DR: The Hive connector is what you use in Starburst Enterprise for reading data from object storage that is organized according to the rules...
We love data engineers at Starburst. They are our people, even when their Starburst Data equivalents try to trick Marketing into pronouncing the data...
In this video, I walk you through the steps of migrating between an existing EMR Presto cluster that has existing data in S3 and...
In today’s data architecture economy, there are no shortages of options when it comes to choosing various distributions and deployment strategies for a given...
A few days ago I read a Gartner report stating that data scientists spend 23% of their time on data collection and preparation. I...
As you probably know, Starburst is one of the main contributors and sponsors of the Presto open source project and the community around Presto....
Our customer base has been growing quickly, and we’re excited to share a case study highlighting one of our largest clients, a telecommunications...
Article reposted from Medium with permission from the author, Ashish Singh | Pinterest Engineer, Data Engineering As a data-driven company, many critical business decisions...
What happened in 2019? 2019 was a big year for Starburst. Today we shared some of our major accomplishments: ...
We call ourselves the Presto experts here at Starburst Data, but what does that actually mean? ...
Starburst Presto 323e is now our most exciting and feature-rich release to date. When we founded Starburst, our vision was to...
Kubernetes (K8s) eases the burden and complexity of configuring, deploying, managing, and monitoring containerized applications. We are excited to announce the availability and support...
Welcome to the Advanced SQL Features in Presto series. In this series you are going to cover a set of SQL features that expands...
With more and more companies using AWS for their many data processing and storage needs, it’s never been easier to query this data with...
Originally posted http://prestodb.rocks/news/presto-memory There is a highly efficient connector for Presto! It works by storing all data in memory on Presto Worker nodes, which...
Karol Sobczak, Co-founder and Software Engineer at Starburst Welcome back to the series of blog posts (check out our previous post!) about Presto's first Cost-Based...
Wojciech Biela, Co-founder at Starburst Introduction As mentioned in our previous blog about the Starburst Presto release and its hottest addition - the Cost...
Next week, we will be releasing the Starburst Distribution of Presto 195e. Based on prestosql/presto 0.195, Starburst’s 195e will ship with Presto’s first cost-based...
As you may have learned from our first press release, we have announced the creation of Starburst, a new independent company solely focused on...
© Starburst Data, Inc. Starburst and Starburst Data are registered trademarks of Starburst Data, Inc. All rights reserved. Presto®, the Presto logo, Delta Lake, and the Delta Lake logo are trademarks of LF Projects, LLC