A data lake is a single store of data that can include structured data from relational databases, semi-structured data and unstructured data. It can include raw copies of data from source systems, sensor data, social data and more. The structure of the data is typically not defined when the data is captured. Instead, data is often loaded into a data lake with little thought about how it will be accessed later.
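Deferring structure until the data is read is often called schema-on-read. The following is a minimal sketch of that idea, not a description of how any particular data lake works: the sample records and the `read_temperatures` helper are hypothetical and exist only for illustration.

```python
import json

# Raw records land in the lake exactly as the sources emitted them.
# No schema is enforced at write time (hypothetical sample data).
RAW_EVENTS = [
    '{"device": "sensor-1", "temp_c": 21.5}',
    '{"device": "sensor-2", "temp_c": "19.0", "battery": 0.8}',
    '{"user": "alice", "text": "hello"}',
]


def read_temperatures(raw_lines):
    """Apply a schema at read time.

    Keeps only records that have both "device" and "temp_c",
    and coerces "temp_c" to float.
    """
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        if "device" in record and "temp_c" in record:
            rows.append((record["device"], float(record["temp_c"])))
    return rows


temperatures = read_temperatures(RAW_EVENTS)
print(temperatures)
```

The third record has no temperature fields, so the reader skips it. The raw store is left unchanged, and a different reader could interpret the same records with a different schema.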
Trino (formerly known as PrestoSQL) was built by four Facebook engineers to address performance, scalability and extensibility needs for analytics at Facebook. Trino is a distributed SQL query engine designed for efficient, low-latency analytics at scale. It emerged from Facebook as a faster and more powerful way to query a very large Hadoop data warehouse than Hive and other tools could provide. Modern data lakes often go beyond HDFS and use object storage from cloud providers, including Amazon Simple Storage Service (S3), Google Cloud Storage and Microsoft’s Azure Blob Storage. Through connectors to these cloud object stores, Trino can query these systems and enable high-performance SQL analytics on your data lake, wherever it is located and however it stores the data.
Trino has become a popular choice for querying the data lake due to its high performance at scale. Trino’s concurrency is limited only by the size of your cluster, which can be scaled up and down as required. Trino also has connectors to the most popular data sources, allowing for data federation across multiple data sources and giving the user a holistic view of their entire data ecosystem. These connectors allow you to query the data where it resides, shortening the data pipeline for your organization.
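To make the idea of federation concrete, here is a small sketch that builds one SQL statement joining tables from two different catalogs. Trino addresses tables as `catalog.schema.table`. The catalog names (`hive`, `postgresql`), the schemas, tables and columns below are assumptions chosen for illustration, not part of any real deployment.

```python
def qualified(catalog, schema, table):
    """Return a Trino-style fully qualified table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def build_federated_query():
    """Build a single query that joins data from two hypothetical catalogs."""
    # Assumed catalog "hive": tables stored as files in object storage.
    orders = qualified("hive", "sales", "orders")
    # Assumed catalog "postgresql": tables in an operational database.
    customers = qualified("postgresql", "public", "customers")
    return (
        "SELECT c.region, sum(o.total) AS revenue\n"
        f"FROM {orders} AS o\n"
        f"JOIN {customers} AS c ON o.customer_id = c.id\n"
        "GROUP BY c.region"
    )


query = build_federated_query()
print(query)
```

The sketch only constructs the query text. Running it would require a Trino cluster with both catalogs configured and a client to submit the statement, which is outside the scope of this example.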
To learn more about Trino, check out our Trino FAQ page.
© Starburst Data, Inc. Starburst and Starburst Data are registered trademarks of Starburst Data, Inc. All rights reserved. Presto®, the Presto logo, Delta Lake, and the Delta Lake logo are trademarks of LF Projects, LLC.