There’s still time to register for Datanova!
Our keynote kickoff is “Fun with Founders: Expectation vs. Reality,” featuring Matt Fuller from Starburst, Drew Banin from dbt Labs, Maxime Beauchemin from Preset, and Michel Tricot from Airbyte.
Join us to hear about the most outrageous things they’ve done to win over a customer.
It’s 10AM and showtime with our talented Monica Miller!
Fun with founders: expectation versus reality with Matt, Drew, Max and Michel
Matt Fuller on wearing shoes two sizes too small to a customer meeting
Drew Banin on working on a secret project, which is now dbt Labs
Max Beauchemin on the importance of not using sensitive data
Michel Tricot on developing features in real time
Is data mesh the end of data engineering?
Finally! Who will win this debate? Let’s see.
Seems like Reis, Mott and Tartow are continuing the debate in the parking lot.
In the meantime, sign up to participate in the next Data Mesh Book Club. (Next in the series, Andy and Adrian will be discussing Fundamentals of Data Engineering)
Listen to all the latest episodes during your commute on Apple Podcasts!
Data products for everyone featuring Glovo
Simone Grandi, Head of Products at Glovo, explains how they went from an MVP (building 1–2 data products) to self-service with 90 data products in production.
Vishal Singh, Head of Data Products at Starburst, uses Starburst Data Products to demonstrate a quick and easy way to set up and run a data mesh in your organization.
Within a few seconds, the data owner can see who has access, assign and revoke access, review how the product has been used, add new data sources, and explore data sets.
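The owner actions listed above can be pictured as a tiny access-management model. This is an illustrative sketch only, not Starburst's API; the class and method names are hypothetical.

```python
# Hypothetical sketch of a data product's access controls: who has access,
# assign/revoke, and attaching new data sources. Not Starburst's API.

class DataProduct:
    def __init__(self, name):
        self.name = name
        self.grants = set()   # roles currently granted access
        self.datasets = []    # data sources attached to the product

    def assign(self, role):
        self.grants.add(role)

    def revoke(self, role):
        self.grants.discard(role)

    def who_has_access(self):
        return sorted(self.grants)

    def add_dataset(self, dataset):
        self.datasets.append(dataset)

product = DataProduct("orders")
product.assign("analyst")
product.assign("engineer")
product.revoke("engineer")
print(product.who_has_access())  # ['analyst']
```

The point of the sketch is that every owner action in the demo maps to one small, auditable operation on the product itself.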
Vishal: In Galaxy, you can share queries across different roles according to assigned permissions…and collaborate with other members to create new insights and data products.
Check out yesterday’s talk on the Starburst product vision.
Sign up for the upcoming Starburst Galaxy Lab, April 4th, 2023 1-2pm EST
Building a data analytics platform with a lakehouse at 7bridges
Postgres didn’t scale, hence a new data architecture.
Simon: Remove compute as much as possible, and move our data into a better analytical format, which is great for our AI and reporting. The important thing is working on a lakehouse architecture… It gives us a flexible environment to work with.
The architecture behind fault-tolerant execution
Trino was designed to replace Hive…We enabled data engineers to run analysis much more quickly.
Challenges with the original architecture: organizations aren’t fans of running big clusters that serve a few queries and sit idle the rest of the time, and resource management is hard.
Goals: How do we get Trino to work better in the face of failure? The most immediate one: we wanted Trino to tolerate node failures.
In addition: run queries reliably, support more flexible resource-management strategies (e.g., cancel the first query and schedule the second one), and dynamically adapt a query.
Benefits of this new architecture: recovering at the level of a single task reduces overall time and latency; the memory you need is the memory you use, and queries can be accommodated based on the memory you have.
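The core idea, retrying only the failed task rather than restarting the whole query, can be sketched in a few lines. This is a toy illustration of the concept, not Trino's implementation; the function names and the failure simulation are invented.

```python
# Toy sketch of task-level retry: each task of a query is retried on
# failure independently, so one node failure doesn't restart the query.
# Not Trino's code; names and error handling are hypothetical.

def run_query(tasks, max_attempts=3):
    """Run each task; on failure, retry only that task."""
    results = []
    for task in tasks:
        for attempt in range(1, max_attempts + 1):
            try:
                results.append(task())
                break
            except RuntimeError:
                if attempt == max_attempts:
                    raise  # give up after exhausting retries
    return results

# Simulate a task that fails once (a "node failure"), then succeeds.
flaky_state = {"calls": 0}

def flaky_task():
    flaky_state["calls"] += 1
    if flaky_state["calls"] == 1:
        raise RuntimeError("simulated node failure")
    return "partial result"

print(run_query([lambda: "ok", flaky_task]))  # ['ok', 'partial result']
```

Contrast with the old behavior: without task-level recovery, the simulated failure above would have forced the entire query, including the already-finished first task, to rerun.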
Unlocking the power of analytics and AI to create data products and apps at Deloitte
The best of both worlds: Achieving query latency and flexibility with Apache Pinot and Trino
dbt and Starburst: better together
In data & people we trust: building a reliable data platform with Trino and data observability
Data observability starts with your ‘why’
The people define what data quality means to them and agree to uphold the standards.
Then we can start thinking about the process and changing behavior and data culture.
Common scenarios for how Assurance uses Monte Carlo
Claim your free early-release copy (a $67 value) of the O’Reilly book Data Quality Fundamentals
An introduction to data contracts with Chad Sanderson and Monica Miller
Chad: If data quality was poor, we would make terrible predictions on a shipment….We lost a million dollars over a couple of days because of those issues…
A data contract is an API agreement between data producers and consumers on the semantics, the SLA, etc.
When we talk about data quality, we often place the responsibility on the consumer/data engineer: to monitor it, root cause it, solve the problem — which they often…can’t!
Data quality failures often happen upstream…
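One way to make that producer/consumer agreement concrete is to express the contract as an explicit schema that producers validate against before publishing. This is a minimal sketch under assumed field names (the shipment fields are hypothetical, echoing Chad's example); real contracts also cover semantics and SLAs, which a type check alone can't capture.

```python
# Minimal sketch of a data contract as a schema check run by the
# *producer* before publishing. Field names are hypothetical.

SHIPMENT_CONTRACT = {
    "shipment_id": str,
    "weight_kg": float,
    "destination": str,
}

def validate(record, contract):
    """Return a list of contract violations for one record."""
    errors = []
    for field, expected_type in contract.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}")
    return errors

good = {"shipment_id": "S-1", "weight_kg": 12.5, "destination": "BCN"}
bad = {"shipment_id": "S-2", "weight_kg": "heavy"}
print(validate(good, SHIPMENT_CONTRACT))  # []
print(validate(bad, SHIPMENT_CONTRACT))
# ['weight_kg: expected float', 'missing field: destination']
```

The design point matches the talk: the check runs upstream, where the data is produced, instead of leaving the consumer to monitor and root-cause failures after the fact.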
Chad’s Data Quality Camp
Get in the driver’s seat of your AI industrialization using Dataiku
The promise of AI
To win the AI race, manage the right balance
Three parameters to win the AI race without losing control
How to leverage your existing investments and innovate simultaneously, and other questions attendees are asking
Six ideas on why data engineering might fail with Benn Stancil, cofounder and Chief Analytics Officer at Mode
- Data engineering is boring and no one wants to do this job.
- We all get fired. Because we cost too much.
- We all get fired. We’re not that valuable.
- We’re replaced by tools.
- We’re replaced by other people.
- We’re automated by AI.
Thank you for joining us
We hope we delivered valuable content, and we hope to see you next year. Perhaps…in person?!
If you're a data rebel, register now.
Free. Virtual. Global.