
Tag: Data Lake

Showing 83 results

BestSecret’s data journey: Moving beyond Snowflake

August 16, 2023

Of all the seventy-plus speakers at the festival, there was one presentation that I found to be particularly interesting – and not because the speaker also happens to be our customer. That presentation was from Lutz Künneke, Director of Engineering, and Isa Inalcik, Senior Data Engineer, at BestSecret, a leading European online destination for off-price fashion based near Munich, Germany. As Künneke got to the stage, the first words out of his mouth were: “We are moving off of Snowflake.” 

GigaOm TCO report: Starburst data lakehouse enables 3x faster time to insight at half the cost

August 16, 2023

A new report published by GigaOm, Cloud Data Warehouse vs. Cloud Data Lakehouse: A Snowflake vs. Starburst TCO and Performance Comparison, concluded that a Starburst lakehouse architecture could achieve superior price-performance and significantly faster time-to-insight at a much lower total cost of ownership (TCO).

Testing the boundaries of partitioning for data lake analytics

July 27, 2023

Discover how Starburst’s nanoblock indexing accelerates data lake analytics by optimizing queries and reducing data reads. Try it in Starburst Galaxy for accelerated performance!

Starburst data lake certification and training

July 24, 2023

A data analytics certification program covering topics such as data lakes, data lakehouses, and modern table formats like Apache Iceberg.

Data pipelines and data lakes: Transforming raw data into actionable insights

July 20, 2023

ETL operates as the engine behind the data pipeline process, moving data from a raw state to a consumable one. Let’s unpack the way in which this typically operates in a modern data lake or data lakehouse. Later, we’ll take a tour to see how Starburst Galaxy fits in this picture and how it can be used to construct the Land, Structure and Consume layers typical of a modern data lake.

Google Looker and Starburst Galaxy: Modern, trusted BI for your modern data lake

June 20, 2023

With the Looker and Starburst Galaxy integration, teams can now extend Looker beyond data in Google Cloud services like BigQuery to other cloud data sources – including data in AWS and Azure. This means that Looker can now support customers with multi-cloud environments.

What to consider when designing a data lake analytics architecture for a startup

June 6, 2023

Of all the choices a startup has to make in its early stages, deciding on the right data analytics architecture might not seem critical,...

Accelerate AI with a data lake analytics platform

May 22, 2023

A data lake analytics platform is needed to bridge the gap between a potentially large number of analytical AI tools and the data lakes, lakehouses, legacy systems, and other technologies in the ecosystem.

BCG landmark research: Spiraling data costs and complexity reach a tipping point

May 1, 2023

The number of unique data vendors has tripled in the past decade (from about 50 to close to 150 today), driven in large part by massive data stack investments, which totaled about $245 billion between 2012 and 2021.

Fueling Trino large-scale geospatial analysis with Starburst Warp Speed

March 27, 2023

In our last post, we discussed two methods for running geospatial analysis with Trino and the Hive connector and explored a few optimization techniques...

Automated maintenance for Apache Iceberg tables in Starburst Galaxy

February 15, 2023

This post is part of the Iceberg blog series. Read the entire series: Introduction to Apache Iceberg in Trino Iceberg Partitioning and Performance Optimizations...

Lie #3 — You’re ready for the AI + ML deep end

February 3, 2023

You’ve hired pedigreed data scientists and engineers, invested in shiny new software, and perhaps even reorganized your entire business, all in the hopes of...

Lie #1 — A single source of truth

February 1, 2023

Technology vendors have long peddled a version of nirvana where all of a company’s data would be centralized in one location.  The “single source...

Simplified Cloud Storage Governance with Starburst and Immuta

January 4, 2023

Accessing data in cloud storage has been an ongoing challenge for analysts, data engineers, and organizations as a whole. Additional work is required to...

Over 80 Data & Analytics Statistics, Data, Trends, and Facts

December 28, 2022

Most organizations have data and continue to generate and collect it on a daily basis, but have a far more difficult time in getting...

4 data trends to watch in 2023

December 19, 2022

By Martial Coiffe & Victor Coustenoble. 2022 confirmed that data architecture remains a central concern for companies and organizations in France,...

Tableau Cloud + Starburst: New Connector Supports Shift to Cloud-based SaaS

December 19, 2022

The shift to cloud-based software-as-a-service platforms is accelerating in just about every tech industry. So it wasn’t much of a surprise to the analytics...

Apache Iceberg Time Travel & Rollbacks in Trino

December 7, 2022

This post is part of the Iceberg blog series. Read the entire series: Introduction to Apache Iceberg in Trino Iceberg Partitioning and Performance Optimizations...

Data Lake vs. Data Warehouse: How Data and Schema Interact

December 6, 2022

After years of building enterprise data warehouses, at first glance, a data lake architecture may appear to be similar to a data warehouse. After...

Building and governing a data mesh with Starburst and AWS Lake Formation

November 29, 2022

The increasing popularity of data lakes isn't surprising anyone in the analytics space. The appeal of importing data from multiple sources into a single...

Apache Iceberg Schema Evolution in Trino

November 22, 2022

This post is part of the Iceberg blog series. Read the entire series: Introduction to Apache Iceberg in Trino Iceberg Partitioning and Performance Optimizations...

Reliving the Hype: Highlights from Trino Summit 2022

November 18, 2022

Last week in San Francisco was one for the Trino history books. After three years of planning, rescheduling, planning, and rescheduling some more, Starburst...

Apache Iceberg DML (update/delete/merge) & Maintenance in Trino

November 17, 2022

This post is part of the Iceberg blog series. Read the entire series: Introduction to Apache Iceberg in Trino Iceberg Partitioning and Performance Optimizations...

Explore A New Way Of Utilizing A Data Lakehouse

November 10, 2022

A data lakehouse combines the principles of a data lake and a data warehouse to include the best of both worlds. Data lakehouses are...

Iceberg Partitioning and Performance Optimizations in Trino

November 8, 2022

This post is part of the Iceberg blog series. Read the entire series: Introduction to Apache Iceberg in Trino Iceberg Partitioning and Performance Optimizations...

6 Considerations for Choosing the Right Cloud Data Lake Solution

October 26, 2022

Data lakes have amazing attributes. For one, they enable us to handle vast, complex datasets. Data lakes offer an up-to-date stream of data that...

How to Create a Well Designed Data Lake Architecture?

October 13, 2022

Data lakes deliver unprecedented agility. A data lake is an essential tool for big data analytics. A key advantage of developing a data lake...

Second Edition of Trino: The Definitive Guide

October 5, 2022

Starburst has played a key role in the Trino community for a long time now. We contribute  to the success of Trino every day....

Building Reporting Structures on S3 using Starburst Galaxy and Apache Iceberg

October 4, 2022

AWS S3 has become one of the most widely used storage platforms in the world. Companies store a variety of data on S3 from...

The Data Virtualization Evolution is Just Beginning

October 4, 2022

Data virtualization revolutionized the data infrastructure space by serving data consumers directly on top of data stores, without the need to move data elsewhere....

Delivering Text Search Capabilities Directly on the Data Lake with Starburst

September 29, 2022

In the big data analytics world, enabling analytics on unstructured text is a powerful capability. For that reason, it would be of use that...

Five Exciting Big Data Trends Worth Taking a Closer Look

September 20, 2022

After Covid-19, many business executives faced one of the toughest leadership tests to turn this challenge into an amazing opportunity. What did the business...

Rethinking SIEM Solutions

September 13, 2022

As organizations strive to become more agile, there has been a mass movement jumping headfirst into what is called a security data lake. Gartner...

The Difference Between Micro-Partitioning vs. Indexing and a Better Way

September 8, 2022

How to choose the right solution for your big data analytics engine: When optimizing your analytics database performance, one of the most important decisions...

Data Lake Solutions Foster a Range of Analytics Use Cases

August 31, 2022

Data lakes enable the implementation of a wide range of solutions, including raw data collection, flexible data access for users, and building fast and...

Security Data Lake: Identifying Threats Faster

August 26, 2022

The glory days of SIEM are over. Security teams are not only measured by their ability to collect as much data as possible, but...

The choice is yours: Open source Trino and Starburst Galaxy

August 9, 2022

A few months back when Starburst Galaxy launched on AWS, Google Cloud, and Azure, I wrote a blog on What Fully-Managed Means to Starburst....

AWS Dev Day Recap: Data Lake Analytics with Starburst Galaxy

August 5, 2022

On Wednesday, August 3rd, I had the opportunity to share a hands-on lab exploring Data Lake reporting structures with my AWS partner in crime,...

Scaling Up: When to Migrate from PostgreSQL to a Data Lake

July 13, 2022

One of the true pillars of the tech revolution, PostgreSQL is an OLTP database designed primarily to handle transactional workloads. The technology has been...

Starburst Acquires Varada To Deliver Faster (and Cheaper) Data Lake Analytics

June 23, 2022

I’m excited to announce the acquisition of Varada, a data analytics accelerator, based out of Tel Aviv, Israel. Varada offers a data lake analytics...

Employee Perspective: Accelerating Data-Driven Insights in AdTech

June 16, 2022

Before I joined Starburst, I worked in the AdTech industry where companies buy and sell user data for online targeting advertisement campaigns or ML/AI-based...

Data Lake Analytics for Smart, Modern Data Management

May 27, 2022

Best-in-class organizations need fast, reliable data analytics that enable business leadership to identify patterns and key insights that will help them predict the best...

The Past, Present, and Future of Trino

May 24, 2022

Recently, I had the pleasure of chatting with Ravit Jain on his show “The Ravit Show” to discuss the evolution of Trino and where...

Starburst and Databricks Collaborate on the Trino Delta Lake Connector

March 24, 2022

This blog was co-authored by Claudius Li, Product Manager at Starburst, and Joe Lodin, Information Engineer at Starburst. Starburst recently donated the Delta Lake...

Starburst donates the Delta Lake connector to Trino

March 15, 2022

Starburst was founded around the Trino open source project. Many contributors and maintainers of the project are part of our teams. Our products, Starburst...

What A SQL Query Engine Can Do For Big Data

February 16, 2022

Nod with me if you’ve suffered from the following problems with processing and analyzing Big Data via a centralized approach: different query languages, niche...

The Top Six Reasons to Migrate to the Cloud

January 25, 2022

Starburst released the 2021 State of Data market research report, conducted by Enterprise Management Associates (EMA), in collaboration with Red Hat, early last year....

Starburst Stargate: One Cluster to Rule Them All

December 9, 2021

I think of Starburst Stargate as the Lord of the Rings feature. Or the galactic empire feature. In a prior blog post, I introduced...

Part 2 of Current Data Patterns Blog Series: Data Lakehouse

December 6, 2021

As companies shift their analytical ecosystems from on-premise to cloud and try to avoid “data lock-in”, we’re noticing some very interesting data patterns. This...

Azure Data Lake: Powered by Starburst Galaxy

December 2, 2021

Earlier this week, we announced the launch of Starburst Galaxy on Microsoft’s Azure cloud service. Starburst Galaxy is the new fully-managed SaaS service from...

Tableau is Just Better with Starburst

November 15, 2021

I’m one of those strange people who has always enjoyed doing performance testing. The thought of spinning up lots of machines to do my...

The Analytics Engine for Distributed Data

October 1, 2021

The idea of a single source of truth has been around since the beginning of big data. However, over the years, through the data...

Data Mesh: Embracing Decentralized Data Paradigms

September 20, 2021

Many data and analytics practitioners have heard about this socio-technical paradigm shift, Data Mesh, and would like to learn more. But before describing what...

Dynamic Filtering: Supporting High Speed Access to Data

September 20, 2021

Analysts are often tasked with deriving insights for business units where the data can span multiple locations.  This is increasingly true today when the...

Accelerating Data Science with Trino

August 31, 2021

At our Datanova for Data Scientists conference on July 14, I held a discussion with Dain Sundstrom and David Phillips, CTOs of Starburst, about...

Part 1 of Current Data Patterns Blog Series: Hybrid Distributed Data Store and RDBMS

August 12, 2021

As companies shift their analytical ecosystems from on-premise to cloud and try to avoid “data lock-in”, we’re noticing some very interesting data patterns. This...

Redefine Your Analytics Without ETL Using Starburst and Amazon EKS

June 14, 2021

Amazon EKS and Starburst help new users easily adopt, manage, and operate Kubernetes ...

Starburst Stargate: The Final Frontier in Analytics Anywhere

June 9, 2021

Today we announced Starburst Stargate, the industry’s first gateway for global cross-cloud analytics. I’m excited to share more behind why we built this and...

Trino on Ice IV: Deep Dive Into Iceberg Internals

June 8, 2021

Welcome back to the Trino on Ice blog series that has so far covered some very interesting high level concepts of the Iceberg model,...

Starburst Supports Launch of Delta Sharing, the First Open Protocol for Secure Data Sharing

May 26, 2021

At Starburst, we believe in building optionality into your data architecture & strategy. To us, optionality means building for flexibility so that you don’t...

Trino on Ice III: Iceberg Concurrency Model, Snapshots, and the Iceberg Spec

May 25, 2021

Welcome back to this blog series discussing the amazing features of Apache Iceberg. In the last two blog posts, we’ve covered a lot of...

Trino on Ice II: In-Place Table Evolution and Cloud Compatibility with Iceberg

May 11, 2021

In-place table evolution and cloud compatibility with Iceberg ...

Trino On Ice I: A Gentle Introduction To Iceberg

April 27, 2021

We’re excited to debut this blog series ‘Trino on Ice’ with a gentle introduction to Iceberg. Stay tuned for future posts from the Trino...

The Great Data Architecture Debate: Data Lake, Data Warehouse or the Data Lakehouse?

April 6, 2021

This is a crazy and slightly confusing time in the data architecture space. More and more companies are shifting toward data lakes, yet the...

Understanding the Starburst and Trino Hive Connector Architecture

February 18, 2021

After a decade of running Hive queries on their data lakes, many companies are astonished at the speeds at which they are able to...

The Future of Analytics: In Conversation With Matt Fuller

February 5, 2021

Datanova is just next week. More than 2,000 data and analytics leaders will join us to learn more about how to unlock the value...

6 Reasons to Attend Datanova 2021: #2, The Oxford Debate

January 25, 2021

Datanova 2021 is going to have plenty of panels and informative content for anyone interested in the future of big data management. We're also...

Top 10 Reasons to Migrate from OS Presto on EMR to Starburst Enterprise Presto

November 13, 2020

In today’s data architecture economy, there is no shortage of options when it comes to choosing various distributions and deployment strategies for a given...

The Death of Apache Drill

August 6, 2020

One of the things that really drew me to and got me excited about Presto over 4 years ago was that it wasn’t tied...

Presto & Data Science: Getting Data Into the Hands of Data Scientists (Faster)

June 26, 2020

A few days ago I read a Gartner report stating that data scientists spend 23% of their time on data collection and preparation. I...

How a Telecommunications Giant Established Universal Data Access

April 3, 2020

Our customer base has been growing quickly, and we’re excited to share a case study highlighting one of our largest clients, a telecommunications...

The 4 Stages to Big Data Nirvana (In the Cloud)

July 18, 2019

Nirvana - a state of perfect happiness; an ideal or idyllic place.  In big data “Nirvana” is a wishlist of items: The ability to...

Starburst Presto & Databricks Delta Lake Support

June 13, 2019

TL;DR - Starburst Data is excited to announce Presto Databricks Delta Lake compatibility. Delta Lake: The big data ecosystem has many components but...

Your analysts don’t care where the data lives, neither should you

April 30, 2019

In today’s enterprise, data is arguably one of the most valuable assets. With advances in data analytics technology, enterprises can more easily convert enormous...

Starburst brings an enterprise-ready Presto to Microsoft Azure & HDInsight users

April 17, 2019

Parts of this blog were co-authored by Ashish Thapliyal, Principal Program Manager, Azure HDInsight. Starburst Data is excited to have our latest release, Starburst Presto 302e,...

The Art of Abstraction: the continuing separation of compute and storage for data analytics

December 4, 2018

We recently invited 451 Research VP Matt Aslett to share his thoughts and observations on the practice of separating the storage and computation of...

Querying data in S3 using Presto and Looker

July 10, 2018

With more and more companies using AWS for their many data processing and storage needs,  it’s never been easier to query this data with...

Starburst’s Presto on AWS up to 18x faster than EMR

June 26, 2018

Karol Sobczak & Anu Sudarsan, Co-Founders & Software Engineers at Starburst. Introduction: Last week, we announced the availability of Starburst’s Presto on AWS Marketplace. With this...

Presto Available on AWS Marketplace!

June 19, 2018

Today I am excited to announce the availability of Presto on AWS Marketplace by Starburst. The Presto AWS Marketplace offering is based on...

Data Lakes without Hadoop

May 14, 2018

It seems like migrating to the cloud has dominated the news and a lot of companies are shuttering their data centers and letting cloud...

Presto gets EVEN FASTER, with a 10-15x performance boost in upcoming release!

March 20, 2018

Next week, we will be releasing the Starburst Distribution of Presto 195e. Based on prestosql/presto 0.195, Starburst’s 195e will ship with Presto’s first cost-based...

AWS Data Analytics Platform – Starburst Data’s Vision

February 21, 2018

As more and more companies turn to low-cost object storage to store a majority of their data, providing easy access to this data has...

The Floodgates Are Open – S3 to EC2 now at 25Gbps!

January 31, 2018

Some great news from the AWS folks. They have increased network bandwidth for EC2 instances when communicating with S3. For those of you that...
