Trino is an open-source distributed SQL query engine designed to query large data sets distributed over one or more heterogeneous data sources.
This string of terms can confuse Trino users, even though it usefully distinguishes Trino from database management systems and other tools that expose a SQL interface to query data. Because both Trino and databases are queried with SQL, many users make assumptions about the software and its applications. For instance, they expect Trino to support insertions and deletions across all of the data sources it connects to. While these features are certainly useful and do make it into Trino for some data sources, the general stance is that Trino aims first and foremost to support OLAP (Online Analytical Processing) use cases.
The simplest way to summarize OLAP is that you focus on making reads faster by putting less emphasis on making writes fast and durable. Durable, in this sense, means that you have clear expectations around the state of the various tables in your database, imposed through schemas, constraints, and transactions. This is how OLAP contrasts with OLTP (Online Transactional Processing) systems: OLAP systems generally do not support ACID transactions. It may further confuse you to see that some connectors, like the Hive connector, do in fact support some notion of ACID transactions. This is, for now, the exception rather than the rule, and there are limitations on the types of INSERT, UPDATE, and DELETE operations you can perform with these systems.
In general, data used for OLAP should consist of snapshots of operational data over time and should not be mutable. Therefore, when evaluating whether to use Trino, it’s a safe bet that if you’re reading out of a data lake or running federated queries across different data sources, you will likely want to use Trino. If you are writing to other data sources, you will need to analyze your requirements and the capabilities of the connectors you wish to use. Trino does a wonderful job of speeding up ETL jobs, provided they copy and transform data rather than merge it on insertion.
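The idea of immutable snapshots that are appended and then read analytically can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard-library sqlite3 module as a stand-in for a real data source rather than Trino itself, and the table name account_snapshots, its columns, and the helper functions are hypothetical names chosen for the example.

```python
import sqlite3


def build_snapshot_store():
    """Create an in-memory table that holds append-only daily snapshots.

    SQLite is only a stand-in here; the table and column names are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account_snapshots ("
        " snapshot_date TEXT, account_id INTEGER, balance REAL)"
    )
    return conn


def append_snapshot(conn, snapshot_date, balances):
    """Append one snapshot; rows already in the table are never updated or deleted."""
    rows = [(snapshot_date, account_id, balance)
            for account_id, balance in balances.items()]
    conn.executemany("INSERT INTO account_snapshots VALUES (?, ?, ?)", rows)
    conn.commit()


def total_balance_by_day(conn):
    """Analytical read: aggregate over every snapshot collected so far."""
    cursor = conn.execute(
        "SELECT snapshot_date, SUM(balance) FROM account_snapshots"
        " GROUP BY snapshot_date ORDER BY snapshot_date"
    )
    return cursor.fetchall()


conn = build_snapshot_store()
append_snapshot(conn, "2021-01-01", {1: 100.0, 2: 50.0})
append_snapshot(conn, "2021-01-02", {1: 80.0, 2: 75.0})
print(total_balance_by_day(conn))
# prints [('2021-01-01', 150.0), ('2021-01-02', 155.0)]
```

Note that the balance change for account 1 is recorded as a new row for the next day instead of an UPDATE to the earlier row, so the write path only copies data in, and the history remains available to read-heavy aggregate queries.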
To summarize, there are many applications for OLAP but here are a few common cases where OLAP is used: