Trino queries through Alluxio using s3:// paths instead of alluxio:// paths

In our product architecture, built on Iceberg with storage on S3, Trino is the query engine and Apache Polaris is the catalogue.
To improve Trino query performance and latency, we are evaluating Alluxio as a cache in front of S3.
We enabled Alluxio with the following configuration:
fs.alluxio.enabled=true
fs.native-s3.enabled=false

The data registered in the catalogue already uses S3 paths such as:
s3://bucketname/…/snap-4624011807584888305-1-caae445a-575b-4002-b134-ecfb3fc19ab5.avro
We would like Trino to map these s3:// paths implicitly to alluxio:// paths. We need to support existing data, and we do not want a hard dependency on Alluxio: we should be able to remove it at any moment and go back to accessing S3 directly from Trino.
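
To make the request concrete, here is a minimal sketch of the translation we would like to happen implicitly. It is only an illustration, not a Trino or Alluxio API: the function name, the Alluxio master address and the mount prefix are all hypothetical placeholders, and the example object key is made up.

```python
from urllib.parse import urlparse

# Hypothetical placeholders for illustration only; not values from our deployment.
ALLUXIO_AUTHORITY = "alluxio-master:19998"
ALLUXIO_MOUNT_PREFIX = "/s3"


def s3_to_alluxio(path: str) -> str:
    """Translate an s3:// URI into an alluxio:// URI under the assumed mount prefix."""
    parsed = urlparse(path)
    if parsed.scheme != "s3":
        # Anything that is not an S3 path is returned unchanged.
        return path
    # parsed.netloc is the bucket; parsed.path is the object key with a leading "/".
    return f"alluxio://{ALLUXIO_AUTHORITY}{ALLUXIO_MOUNT_PREFIX}/{parsed.netloc}{parsed.path}"


# Made-up object key, used only to show the shape of the mapping.
print(s3_to_alluxio("s3://bucketname/warehouse/metadata/snap-1.avro"))
```

The catalogue would keep storing the s3:// form, so removing Alluxio later would only mean dropping this translation step.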

Is there a way to achieve this? We tested it, but it did not work. Does Trino support this use case?

Or does Trino always require alluxio:// paths when Alluxio is enabled?

Please suggest.

I see a solution described here: https://www.alluxio.io/blog/integrate-alluxio-with-your-existing-data-stack-without-redefining-hive-tables

But is it available only in the Enterprise Edition and not in the Community Edition?


Hi vickytaurus, I’m Amelia from the Alluxio community. I’m happy to chat, share more case studies on using Alluxio with Trino, and answer any questions you may have. You can email me at amelia@alluxio.com, or you are welcome to join our Alluxio community Slack.
