
Querying large datasets without a partitioning strategy is often an exercise in patience. Without one, a compute engine like Trino has no choice but to scan every file in an Apache Iceberg table to find the rows that match a query, regardless of how selective that query actually is. Apache Iceberg partitioning solves this by organizing data into smaller, logically grouped subsets so that only the files relevant to a given query ever need to be read. In practice, the difference can be dramatic, turning Iceberg queries that take minutes into ones that return in seconds. As data lakehouse volumes grow, driven in part by AI workloads, that kind of efficiency becomes foundational rather than optional.
Why Apache Iceberg is gaining ground
Apache Iceberg improves on older partitioning approaches in ways that matter for day-to-day data engineering work on modern data lakehouses. Where Hive table formats required users to manually maintain partition columns and write queries that explicitly referenced them, Iceberg handles partition pruning transparently through its metadata layer. Users query Iceberg tables naturally using standard SQL, and Trino takes care of the rest. Importantly, the same metadata architecture that makes selective Trino queries fast also makes large Apache Iceberg tables well-suited to the kind of high-concurrency, mixed analytical and AI workloads that Starburst increasingly supports.
This post walks through how Iceberg partitioning works in Trino, covers the performance optimizations that sit alongside it, and explains how to maintain healthy Iceberg tables over time using built-in file management features, including compaction, snapshot expiration, and orphan file removal. It is part of the Iceberg blog series. Read the entire series:
- Introduction to Apache Iceberg in Trino
- Iceberg Partitioning and Performance Optimizations in Trino
- Apache Iceberg DML (update/delete/merge) & Maintenance in Trino
- Apache Iceberg Schema Evolution in Trino
- Apache Iceberg Time Travel & Rollbacks in Trino
- Automated maintenance for Apache Iceberg tables in Starburst Galaxy
- Improving performance with Iceberg sorted tables
- Hive vs. Iceberg: Choosing the best table format for your analytics workload
Partitioning
Partitioning is used to narrow down the scope of the data that needs to be read for a query. When dealing with big data, this can be crucial for performance and can be the difference between a query that takes minutes or even hours and one that takes seconds!
One of the advantages of Apache Iceberg is its handling of partitions; by contrast, one of the biggest drawbacks of Hive-based tables was the way you had to partition your data.
How Hive and Iceberg handle partitioning differently
For example, most tables you plan to partition have a date or timestamp column indicating when the row was created. Example table:
| Column Name | Datatype |
| --- | --- |
| event_id | integer |
| created_ts | timestamp |
| metric | double |
In Hive, if you wanted to partition by day, you would have to split the created_ts column into year, month, and day. Then you would have to teach your users to always include these columns in their query, even if they wanted to query on created_ts.
create table hive.orders (event_id integer, created_ts timestamp, metric double, year varchar, month varchar, day varchar) with (partitioned_by = ARRAY['year', 'month', 'day']);
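The burden this places on query authors is easy to see. Filtering only on created_ts gives no partition pruning in Hive; users must also filter on the derived columns. A sketch of what every query has to include, using the hypothetical hive.orders table above:

```sql
-- Pruning only happens if the partition columns appear in the predicate;
-- filtering on created_ts alone would scan the whole table.
select event_id, metric
from hive.orders
where year = '2022' and month = '09' and day = '10';
```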
Creating and querying a partitioned Iceberg table in Trino
With Iceberg, you simply partition the data on created_ts using day, and end users would query this table just like they would in a database. Here is an example:
-- create iceberg table partitioned by day on the created_ts column
create table orders_iceberg
(event_id integer, created_ts timestamp(6), metric double)
with (type='iceberg', partitioning = ARRAY['day(created_ts)']);
-- insert rows
insert into orders_iceberg values (1,timestamp '2022-09-10 10:45:38.527000',5.5);
insert into orders_iceberg values (1,timestamp '2022-09-11 03:12:23.522000',5.5);
insert into orders_iceberg values (1,timestamp '2022-09-12 10:46:13.516000',5.5);
insert into orders_iceberg values (1,timestamp '2022-09-13 04:34:05.577000',5.5);
insert into orders_iceberg values (1,timestamp '2022-09-14 09:10:23.517000',5.5);
-- query the table only looking for certain days
select * from orders_iceberg where created_ts >= date '2022-09-10' and created_ts < date '2022-09-12';
The data in Amazon S3, for example, looks like this:

Trino is smart enough to read the Iceberg manifest list and then only look at files that meet the query’s partition requirement. In the example above, that would only be the 2022-09-10 and 2022-09-11 partitions. The Trino documentation lists the available partition transform functions.
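You can inspect this partition layout for yourself through Trino’s hidden Iceberg metadata tables. For example, the $partitions table (queried here against the orders_iceberg table created above) returns one row per partition along with record and file counts:

```sql
-- one row per day partition, with record and file counts
select partition, record_count, file_count
from "orders_iceberg$partitions";
```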
Note: Trino’s Iceberg implementation supports time zones in the timestamp data type (for example, timestamp(6) with time zone). This was a conscious decision based on the industry standard of supporting time zones within timestamp data types, something Hive did not support.
Changing your partitioning strategy without rebuilding your table
Although we’ll cover this in a separate schema evolution blog, you aren’t stuck with this partitioning scheme. At any time, you can modify your partition column. For example, if we decided that partitioning on day is too granular, we can modify the table to now be partitioned by month:
alter table orders_iceberg SET PROPERTIES partitioning = ARRAY['month(created_ts)'];
New data will be created in directories named: created_ts_month=2022-09, for example. The existing data will remain partitioned by day unless the table is recreated.
How Iceberg uses metadata to skip files and improve query performance
When it comes to performance, Iceberg can be highly efficient. This is because metadata is stored for all the files in a table at a given snapshot in time, along with statistics for each file. Trino uses this metadata to skip files that do not contain data relevant to the query, avoiding unnecessary reads entirely.
With partitioning in place, that process becomes even more targeted. Trino first applies partition pruning to eliminate entire partitions that fall outside the query’s scope. It then examines the manifest file metadata for the remaining files, using column-level statistics such as minimum and maximum values and null counts to determine which files actually need to be read. When the data within those files is also sorted by a column referenced in a where clause, the number of files that require a full read drops further still.
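Sorting can be declared when the table is created using the Trino Iceberg connector’s sorted_by table property. The sketch below is a variant of the earlier example table, with files sorted on created_ts so that min/max statistics become even more selective:

```sql
-- partitioned by day, with rows within each file sorted on created_ts
create table orders_iceberg_sorted
(event_id integer, created_ts timestamp(6), metric double)
with (partitioning = ARRAY['day(created_ts)'],
      sorted_by = ARRAY['created_ts']);
```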

Compacting small files with the optimize command
Iceberg includes file management features that improve performance. Traditional data lakes running on Hive often have use cases where data is constantly ingested. That data lands in small files because it needs to be available for immediate querying, and reading many small files hurts performance in any system, especially in cloud storage. Iceberg includes an optimization feature that combines small files into larger ones, improving query performance.
The idea here is to ingest data as fast as possible, making it available for queries even if it might not be the highest-performing, and then offer the ability to combine those files into larger ones at a given interval.
To scan the table for small files and make them larger, you simply issue the following command:
alter table <table> execute optimize;
This will look for any files under 100MB and combine them into larger ones. You can also choose a file size threshold other than the default 100MB:
ALTER TABLE <table> EXECUTE optimize(file_size_threshold => '10MB');
If your Iceberg table becomes very large and the optimize command above is taking too long to run, you can just optimize the files that have arrived recently:
ALTER TABLE <table> EXECUTE optimize where "$file_modified_time" > current_date - interval '1' day;
This will look for files that have arrived since yesterday and optimize them. On a very active table where many changes are taking place, this will greatly reduce the time the optimize command takes.
Example of this in action
Network events are streamed in 1-minute intervals. Small files are dropped into an S3 bucket using the Iceberg api and the data is available immediately using standard SQL. Based on the volume of data and the number of files created, the optimize command can be run at specified intervals to consolidate smaller files into larger ones. This will greatly improve the performance of subsequent queries against this table.
Cleaning up snapshots and orphan files
From time to time, older table snapshots should be cleaned up. These older snapshots contain previous states of the table, which are no longer needed.
There are two operations that clean up old snapshots and data: expire_snapshots and remove_orphan_files.
expire_snapshots – This function removes snapshots older than the retention threshold provided at execution time. For example:
ALTER TABLE <table> EXECUTE expire_snapshots(retention_threshold => '7d')
which will remove snapshots that are older than 7 days.
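To see which snapshots a table currently holds before expiring any, you can query the $snapshots metadata table (shown here against the example orders_iceberg table):

```sql
-- list snapshots, newest first
select committed_at, snapshot_id, operation
from "orders_iceberg$snapshots"
order by committed_at desc;
```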

remove_orphan_files – This function removes files left behind on storage when a query fails to complete for a variety of reasons. This doesn’t happen too often, but it’s a good idea to include it when you run snapshot cleanups. A similar ALTER TABLE statement is used, as shown in this example:
ALTER TABLE test_table EXECUTE remove_orphan_files(retention_threshold => '7d')
Running these two operations together on a regular schedule is good practice for any active Iceberg table.
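Put together, a scheduled maintenance job for the example table might run something like the following (the retention values are illustrative; tune them to your recovery and time-travel needs):

```sql
-- compact small files, then drop old snapshots and any orphaned files
alter table orders_iceberg execute optimize;
alter table orders_iceberg execute expire_snapshots(retention_threshold => '7d');
alter table orders_iceberg execute remove_orphan_files(retention_threshold => '7d');
```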
Taking the next step with Iceberg and Trino
Ready to go further with Iceberg and Trino? The next post in this series covers DML operations, including updates, deletes, and merges, as well as deeper maintenance strategies. Read Apache Iceberg DML and Maintenance in Trino.