Presto turned eight years old a few weeks ago, and it’s just getting started.
Let’s take a journey back to 2012. Facebook had ramped up its data warehouse to take in massive amounts of data, to the tune of 250 petabytes. For those of you who don’t know the story, Hive was the data warehousing software of the day. It had also grown out of Facebook, and it enabled users to run batch processing jobs over Hadoop. Jobs were submitted in a Hive-specific SQL dialect, so users didn’t need to know how to program in the complex MapReduce paradigm. Running SQL on a Hadoop cluster was revolutionary in the early Hadoop era, when adopting Hadoop meant paying a high price to hire experts just to access the data in HDFS. However, Hive was not designed with a human in the loop, and it had other drawbacks such as its nonstandard SQL dialect. An anecdote from a Facebook data scientist at the time was that running even six queries on Hive made for a good day. Previous solutions could not scale to let people sift through this data in real time and still get accurate results. Facebook badly needed to improve the time-to-value for business analysts, developers, and data scientists.
Following Facebook tradition, engineers Dain Sundstrom, David Phillips, Eric Hwang, and Martin Traverso set out to build an entirely new system that could handle the petabyte-scale data already at Facebook and keep up as it grew. These were the humble beginnings of the Presto system we have all come to know and love.
The design of Presto aims to return results quickly and correctly, adhere to the ANSI SQL standard, and above all, keep the system open-source and community-driven. Not only did the project achieve what it set out to do at Facebook, it also expanded well beyond that thanks to its flexible connector architecture. The alchemy of the connector SPI architecture, combined with the open-source culture, ultimately drove the success of Presto. The open-source philosophy provides plenty of benefits, including a collaborative testing suite and user ownership of and influence on the project, which makes for better, more robust software. This culture gives interested parties a stake in the direction of Presto and weaves their story into the Presto legacy. This means that this birthday celebrates not only the accomplishments of Presto as software but also the accomplishments of the individual contributors and companies that helped get Presto to where it is today. No one reflects this notion better than one of the original co-creators of Presto, Martin Traverso.
With that, we here at Starburst want to send a heartfelt thanks to everyone who has contributed and look forward to the next eight years of success with Presto.