Welcome to the first issue of the Presto Newsletter.
Presto Summit 2018 recap
The first-ever all-day Presto Summit brought together many Presto users, committers, and other big data analytics fans. Participants from over 40 companies joined us on July 16th. The agenda was filled with high-quality talks from some of the leading members of the Presto community.
Here is a link to the topics covered and the slides:
https://www.starburstdata.com/
Querying 8.66 Billion Records – a Performance and Cost Comparison between Presto and Redshift (including Spectrum)
This is a very detailed post from Ernesto at Concurrency Labs comparing Presto to Redshift. The comparison includes cost and performance for both solutions and is worth the read:
https://www.concurrencylabs.
Using Presto to query on-premises object stores
Nitish at Minio, a distributed object store for private clouds (roll your own S3), wrote a great post on creating your own object store analytics hub using Presto:
https://blog.minio.io/presto-
Demo: Querying Presto from Qlik Sense
This demo shows how easy it is to use Qlik Sense to query Presto:
https://www.youtube.com/watch?
Presto query optimizer: Pursuit of performance
Starburst CTO Kamil Bajda-Pawlikowski and Facebook’s Martin Traverso presented at the DataWorks Summit on Presto’s new cost-based optimizer:
https://dataworkssummit.com/
Using Presto for GeoSpatial Analytics
Also at DataWorks Summit, Uber engineers talked about using Presto for GeoSpatial Analytics:
https://dataworkssummit.com/
Presto at TiVo, Boston Hadoop Meetup
See how TiVo uses Presto for SQL analytics. This excellent presentation covers a few important topics:
– TiVo’s decision-making process – choosing Presto over Redshift Spectrum
– Choosing the correct AWS instance type for their Presto workloads
– How the different memory structures in Presto work together
– Using MySQL and S3 together to create TiVo’s data warehouse
https://www.slideshare.net/
Presto TPC-DS benchmark on AWS
Before the cost-based optimizer was introduced, Presto had issues running all of the TPC-DS queries. That’s no longer the case, and performance is much better than in older versions of Presto:
https://www.starburstdata.com/
Big Data File Formats – ORC, Parquet & AVRO
At Starburst, we field a lot of questions from customers and prospects on which source file format to use. The answer is usually situation-dependent. This article on Datanami from Alex Woodie does an excellent job of breaking down each format and its advantages and disadvantages in different situations:
https://www.datanami.com/2018/
Third-party Presto benchmarks
Here are two excellent articles on Presto performance comparison benchmarks. It’s no wonder Presto’s popularity has exploded over the last few years:
http://bytes.schibsted.com/
https://virtuslab.com/blog/
Starburst Presto 203e released:
https://www.starburstdata.com/
-AWS Glue Integration
-New geospatial functions and improved geospatial function performance
-Additional SQL subquery support
-SQL FILTER clause for aggregations
-Column-level access control
-Support for authentication with JWT access token
-Various bug fixes that continue to improve the robustness of Presto
-Improvements to query scheduling and resource management
We would like to thank the members of the Presto community for the following contributions:
-Maria Basmanova from Facebook – new geospatial functions and optimizations
-Rentao Wu from AWS – Glue Catalog support
-Li Ding – SQL FILTER clause for aggregations
and many, many more!
Iceberg – A modern table format for big data from Netflix
During the first-ever Presto Summit last week, Netflix presented “Iceberg,” a new table format for storing large, slow-moving tabular data. Their presentation and GitHub links:
https://www.slideshare.net/
https://github.com/Netflix/
© Starburst Data, Inc. Starburst and Starburst Data are registered trademarks of Starburst Data, Inc. All rights reserved. Presto®, the Presto logo, Delta Lake, and the Delta Lake logo are trademarks of LF Projects, LLC