Trino OLAP

Trino is an open-source distributed SQL query engine designed to query large data sets distributed over one or more heterogeneous data sources.

This definition can be confusing to new users, even though it usefully distinguishes Trino from database management systems and other tools that expose a SQL interface to query data. Because both Trino and databases are queried with SQL, many users make assumptions about what the software does and where it applies. For instance, they expect Trino to support insertions and deletions across every data source it connects to. These features are certainly useful, and some data sources do support them, but the general stance is that Trino is first and foremost designed for OLAP (Online Analytical Processing) use cases.

The simplest way to summarize OLAP is that it makes reads faster by putting less emphasis on making writes fast and durable. Durable, in this sense, means that you have clear expectations about the state of the tables in your database, enforced through schemas, constraints, and transactions. This is where OLAP contrasts with OLTP (Online Transactional Processing): OLAP systems generally do not support ACID transactions.

It may be confusing, then, to see that some connectors, like the Hive connector, do support some notion of ACID transactions. For now this is the exception rather than the rule, and there are limitations on the kinds of INSERT, UPDATE, and DELETE operations you can run against these systems. In general, data used for OLAP should consist of snapshots of operational data over time and should not be mutable.

So when evaluating whether to use Trino, it's a safe bet that if you're reading from a data lake or running federated queries across different data sources, Trino is a good fit. If you are writing to other data sources, you will need to weigh your requirements against the capabilities of the connectors you plan to use. Trino does a wonderful job of speeding up ETL jobs, provided the job copies and transforms data rather than merging it on insertion.
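As a rough illustration of the "copy and transform" style of ETL described above, the sketch below composes a CREATE TABLE AS statement that joins tables from two catalogs and writes the result as a new, dated snapshot table instead of merging into an existing one. The catalog, schema, table, and column names (`hive`, `postgresql`, `orders`, `customers`, `region`, `total`) and both helper functions are hypothetical, chosen only for this example; the code builds the SQL text and does not connect to a cluster.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified table name in catalog.schema.table form."""
    return f"{catalog}.{schema}.{table}"


def build_snapshot_ctas(target: str, orders: str, customers: str,
                        snapshot_date: str) -> str:
    """Compose a CREATE TABLE AS statement for a read-only snapshot.

    Assumption: all arguments are trusted identifiers and an ISO date.
    This is an illustration, not a safe way to handle user input.
    """
    return (
        f"CREATE TABLE {target} AS\n"
        f"SELECT c.region,\n"
        f"       DATE '{snapshot_date}' AS snapshot_date,\n"
        f"       sum(o.total) AS revenue\n"
        f"FROM {orders} AS o\n"
        f"JOIN {customers} AS c ON o.customer_id = c.id\n"
        f"GROUP BY c.region"
    )


# Hypothetical layout: order facts in a data lake, customers in PostgreSQL.
sql = build_snapshot_ctas(
    target=qualified("hive", "reports", "revenue_2024_01_01"),
    orders=qualified("hive", "sales", "orders"),
    customers=qualified("postgresql", "public", "customers"),
    snapshot_date="2024-01-01",
)
print(sql)
```

Each run targets a new table name, so earlier snapshots are never updated in place. The resulting statement could then be submitted through whichever Trino client you already use.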

OLAP has many applications. Here are a few common cases where it is used:

  • Integrating data across multiple applications
  • Reporting for various business functions
  • Forecasting and budgeting
  • Artificial intelligence and machine learning applications
  • Keeping the data cached at multiple levels
  • Visualization of data through Business Intelligence (BI) tools such as Tableau, Looker, and Preset
