At Starburst, we want to free our customers to see the invisible and achieve the impossible. Seeing the invisible means giving you access to data you could never reach before. Achieving the impossible means unlocking new insights through greater access to all of your enterprise information. We do that by turning the data warehousing model upside down.
If we think back in history, Teradata pioneered the data warehousing appliance in the late 1970s, consolidating data in one place: bringing all of the data together through ETL and various pipelines and ultimately creating the notion of a single source of truth in one database system. Not much has really changed between then and now. Snowflake follows the same model, just delivered in the cloud.
Ultimately, all of these designs depend on taking data out of your source systems and moving it into a single repository. We believe that the enterprise data warehouse (EDW) model is actually slower than it looks, because what matters is not just query performance but total time-to-insight, which includes the time spent moving the data before anyone can query it. It also creates vendor lock-in because you’ve put all of your most valuable data in the hands of a single vendor. Over time this can also lead to rapidly growing and often unexpected costs, because you’ve given that vendor a monopoly over your data. Finally, it limits your view to whatever is in that EDW at that particular moment in time.
In a recent survey, 38% of respondents said their main challenge was difficulty integrating legacy systems with newer digital systems, and 37% said they had data silos throughout their organization.
The way we think about approaching these challenges is by imagining a city in darkness, with Starburst restoring electricity to that city. We might start by lighting up one block at a time. That first block is where more data lives than anywhere else. We’ll call this the Data Lake. The Data Lake plays a very important role and is a natural center of gravity because it offers open data formats that are interoperable with multiple engines. It provides the lowest Total Cost of Ownership (TCO) because it leverages commodity storage, whether that’s on-prem, in the cloud, or across multiple clouds. And lastly, you get elasticity by separating storage from compute, so each can scale independently. All of this makes the Data Lake an increasingly important element of what you do.
However, no matter how hard you try, you will always have other data sources. One reason is what I like to call the Stonebraker Principle. Mike Stonebraker, a renowned professor at MIT and creator of Ingres, Postgres, Vertica, and a number of other database systems, has argued that there is no one-size-fits-all database system and that, for every type of job, you’re going to have a database that is purpose-built for that use case. You’re going to use Mongo, you’re going to use Kafka, you’re going to use Elastic, Oracle, and the list goes on! You’re also going to have departmental data marts, transactional systems, and operational systems. You’re going to inherit even more data assets through mergers and acquisitions, and you’re going to create new applications and new devices. All of this will create more and more data silos within your organization.
What Starburst aims to do is light up the entire city for you by providing visibility into all of the data that you have. But it doesn’t stop there. The same approach applies on a global scale, across multiple regions and multiple clouds, and under ever-changing global regulatory requirements that call for a decentralized approach: keeping data where it was created in the first place. Starburst is here to light up your world!
For a demo, visit us at starburst.io