What a day! On May 5, 2022, we had a truly worldwide gathering, with people tuning in from six continents to celebrate Cinco De Trino with us. Our half-day conference showcased some of the amazing things happening in the Trino community, from highly relevant data lakehouse content to the successes of Project Tardigrade.
Our hosts, Brian Olsen, Manfred Moser, and Commander Bun Bun, knocked it out of the park, guiding us through a wonderful event that left us all singing, “Querying away again in Trinoritaville”.
After watching the awesome presentation that Commander Bun Bun put together, Martin Traverso, Starburst CTO and Trino co-creator, kicked off our celebration with a keynote on Trino as a data lakehouse. We learned how the building blocks of Trino make it a perfect ecosystem of technologies for lakehouse architecture, and caught a sneak peek at exciting new features such as polymorphic table functions and adaptive query planning. From there, our stellar lineup of speakers put on an incredible show, sharing meaningful and engaging information that will enrich the Trino user experience.
You can watch all the sessions on demand.
The key highlights of each session are captured below:
Project Tardigrade

Commander Bun Bun no longer tolerates failures. Zebing Lin, Software Engineer at Starburst, explains the new fault-tolerant execution paradigm that Project Tardigrade brings to ETL workloads. We saw a live demo of these capabilities, and we heard Cory Darby from BlueCat describe how Project Tardigrade and query retries “…saved the bacon of BlueCat.” Running BlueCat’s long-running queries on the fault-tolerant execution architecture has yielded a better, more cost-effective solution with reliable completion times and the elasticity to meet business needs.
Stream the Project Tardigrade session on demand.
Try for yourself the Project Tardigrade demo.
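For readers who want to try fault-tolerant execution on their own cluster, here is a minimal configuration sketch based on the Trino fault-tolerant execution documentation; the S3 bucket name is a hypothetical placeholder:

```properties
# config.properties (coordinator and workers):
# retry individual tasks instead of failing the whole query
retry-policy=TASK

# exchange-manager.properties:
# task-level retries require spooling intermediate data to durable storage
exchange-manager.name=filesystem
exchange.base-directories=s3://example-bucket/trino-exchange
```

With `retry-policy=QUERY` instead, only whole-query retries are performed and the exchange manager is not required.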
Starburst Galaxy Lab with Trino Co-Creator
See for yourself the capabilities of Starburst’s fully managed cloud offering, Starburst Galaxy, as Trino co-creator and Starburst CTO Dain Sundstrom walks you through a lab exploring ELT in Trino. After demonstrating how easy it is to create clusters and configure catalogs, Dain shows how to gather data from multiple sources into landing tables, convert that data into structured tables, and then create consumer tables that can be queried with Tableau.
Stream the Starburst Galaxy Lab session on demand.
To continue the Galaxy conversation, we invite you to join a webinar with Dain on June 2, 2022.
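The landing → structured → consumer flow from the lab can be sketched in Trino SQL; the catalog, schema, table, and column names below are illustrative, not from the lab itself:

```sql
-- Land raw data from a source catalog into the lake
CREATE TABLE lake.raw.orders_landing AS
SELECT * FROM postgresql.public.orders;

-- Convert the landing data into a cleaned, structured table
CREATE TABLE lake.curated.orders AS
SELECT
    CAST(order_id AS bigint) AS order_id,
    CAST(order_ts AS timestamp) AS order_ts,
    total_amount
FROM lake.raw.orders_landing
WHERE order_id IS NOT NULL;

-- Aggregate into a consumer table ready for a BI tool such as Tableau
CREATE TABLE lake.marts.daily_revenue AS
SELECT date(order_ts) AS order_date, sum(total_amount) AS revenue
FROM lake.curated.orders
GROUP BY date(order_ts);
```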
Bring your data into your data lake with Airbyte
Airbyte is an open-source ETL/ELT tool for straightforward data consolidation. Abhi Vaidyanatha, Senior Developer Advocate at Airbyte, shows how Airbyte takes away the pain of pulling data in from multiple sources, including APIs. We watched Airbyte validate connections, add sources and destinations, and remove a familiar headache for the Trino community: underperforming queries caused by unconsolidated data.
Stream the Airbyte session on demand.
Read Abhi’s article about his talk at Cinco De Trino.
Great Expectations

More data, more problems! In this session, James Campbell from Great Expectations addresses what we all should be striving for while “…solving the challenge of working with data, especially in a dynamic organization.” We watch as he creates simple and effective tests for our data “expectations” in Python, JSON, and even human-readable formats. We also get a master class in using Trino to easily test our “expectations” as they are developed. Now, Trino users can gain even more advanced data observability with the announcement of the Great Expectations Trino connector!
Stream the Great Expectations session on demand.
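To give a flavor of the declarative style of those tests, here is a minimal sketch of a JSON expectation suite; the suite name, column names, and chosen expectations are hypothetical examples, not from the session:

```json
{
  "expectation_suite_name": "orders_suite",
  "expectations": [
    {
      "expectation_type": "expect_column_values_to_not_be_null",
      "kwargs": {"column": "order_id"}
    },
    {
      "expectation_type": "expect_column_values_to_be_between",
      "kwargs": {"column": "total_amount", "min_value": 0}
    }
  ]
}
```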
Data Architecture in 2022
What does the modern landscape look like today? Ryan Blue, co-creator of Iceberg, explains why everyone wants a multi-engine platform and how this can be achieved successfully using flexible compute and SQL warehouse behavior. As an open standard for tables with SQL behavior, Iceberg is uniquely positioned to act as a foundation in both the data warehouse and the data lake worlds, as well as unify the two.
Stream the Iceberg session on demand.
Trino + dbt: Transformations in SQL Heaven
Watch speed-typing superstar Jeremy Cohen from dbt Labs demonstrate how combining Trino and dbt is a recipe for SQL success. By introducing modular data modeling, dbt helps break down custom, complicated SQL queries into manageable, reusable pieces. In combination with Trino, data engineers can “build data like developers build applications.”
Stream the Trino + dbt session on demand.
Check out Jeremy’s demo repo for Cinco De Trino.
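As a taste of that modular style, a dbt model is just a SQL file with templated references to upstream models; the model names and columns below are hypothetical, not taken from Jeremy’s repo:

```sql
-- models/daily_revenue.sql: materialized as a table via the Trino adapter
{{ config(materialized='table') }}

select
    date(order_ts) as order_date,
    sum(total_amount) as revenue
from {{ ref('stg_orders') }}  -- upstream staging model (hypothetical)
group by 1
```

The `ref()` call lets dbt infer dependencies between models and build them in the right order.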
Building Reliable Lakehouses with Delta Lake
Why do you need a data lakehouse? In this session, Denny Lee, Senior Staff Developer Advocate at Databricks, talks about the evolution of data management. Data lakehouses combine the best of databases and data lakes, and Delta Lake helps enable quality lakehouse architecture. We also hear about the work being done to build the ecosystem around Delta Lake, including support for multiple languages as well as data processing systems like Trino.
Stream the Delta Lake session on demand.
Touch, Talk and See your Data with Tableau
Vlad Usatin of Tableau shares how to give meaning to your data. We see how to connect to your data using the Starburst Enterprise connector within Tableau. After connecting, a dashboard can be built with only clicks and human language, no code required. We also get to see the interactive and intuitive experience Tableau provides for creating your own dashboards.
Stream the Tableau session on demand.
Lakes and Houses: Put into Perspective
Everyone wants a successful data architecture, but how do we get there? Vinoth Chandar from Apache Hudi discusses the benefits and pitfalls of warehouses vs. lakehouses, deep diving into architecture, capabilities, and price/performance. We then hear how an interoperable lakehouse can serve as the backbone of a successful data architecture.
Stream the Hudi session on demand.
Thank you to all who attended; we hope to see you again at our upcoming events later this year.
A huge thank you to our speakers and sponsors for their support; this event truly would not have been as special without your contributions.