Trino is an open-source distributed SQL query engine designed to query large data sets distributed over one or more heterogeneous data sources.
This string of terms can confuse Trino users, even though it usefully distinguishes Trino from database management systems and other tools that expose a SQL interface for querying data. Because both Trino and databases are queried with SQL, many users make assumptions about the software and its applications. For instance, they expect Trino to support insertions and deletions across all of the data sources it connects to. While these features are certainly useful and do make it into Trino for some data sources, the general stance is that Trino aims first and foremost to support OLAP (Online Analytical Processing) use cases.
The simplest way to summarize OLAP is that it makes reads faster by putting less emphasis on making writes fast and durable. Durable, in this sense, means that you have clear expectations about the state of the tables in your database, imposed through schemas, constraints, and transactions. This is how OLAP contrasts with OLTP (Online Transactional Processing) systems: it generally doesn't support ACID transactions. It may further confuse you to see that some connectors, like the Hive connector, do in fact support some notion of ACID transactions. This is, for now, the exception rather than the rule, and there are limits on the types of INSERT, UPDATE, and DELETE operations you can perform with these systems. In general, data used for OLAP should consist of snapshots of operational data over time and should not be mutable.
Therefore, when evaluating whether to use Trino, it's a safe bet that if you're reading from a data lake or running federated queries across different data sources, you will likely want to use Trino. If you are writing to other data sources, you will need to analyze your requirements and the capabilities of the connectors you wish to use. Trino does a wonderful job of speeding up ETL jobs, provided the job copies and transforms data rather than merging it on insertion.
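As a rough illustration of the two patterns described above, the following minimal sketch shows a federated read and a copy-and-transform ETL step expressed as Trino SQL held in Python strings. Every catalog, schema, table, and column name (`hive.web.page_views`, `postgresql.public.customers`, `hive.analytics.orders_snapshot_*`, and so on) is hypothetical, as are the helpers `snapshot_ctas` and `run`; whether the write succeeds depends on the connectors configured in your deployment.

```python
from datetime import date

# All catalog, schema, table, and column names below are hypothetical.

# Federated read: join data lake files exposed through a Hive catalog with
# an operational PostgreSQL table in a single Trino query.
FEDERATED_QUERY = """
SELECT c.region, count(*) AS page_views
FROM hive.web.page_views AS v
JOIN postgresql.public.customers AS c
  ON v.customer_id = c.id
GROUP BY c.region
"""


def snapshot_ctas(day: date) -> str:
    """Build a CREATE TABLE AS statement that copies and transforms data
    into a new, dated snapshot table instead of merging rows into an
    existing table."""
    return (
        f"CREATE TABLE hive.analytics.orders_snapshot_{day:%Y%m%d} AS\n"
        f"SELECT id, customer_id, total, DATE '{day.isoformat()}' AS snapshot_date\n"
        "FROM postgresql.public.orders"
    )


def run(cursor, sql: str):
    """Execute a statement on any DB-API style cursor and return all rows."""
    cursor.execute(sql)
    return cursor.fetchall()
```

Assuming the `trino` Python client is installed, a cursor obtained from `trino.dbapi.connect(...)` could be passed to `run`, for example `run(cursor, snapshot_ctas(date(2024, 1, 31)))`. Writing each snapshot to its own table keeps existing data immutable, which matches the copy-and-transform style of ETL described above.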
To summarize, there are many applications for OLAP but here are a few common cases where OLAP is used: