Looking to Resist the Pull of Data Gravity?

By: Jacqueline Vail
December 3, 2021

The ability to make data-driven decisions is fundamental to many organizations. However, Data Gravity can make using all of your data increasingly difficult. The term was coined by Dave McCrory in 2010, and its general principle is that the closer you are to a data source, the faster your access to it. McCrory also sat down to discuss this topic during our Datanova conference with our CEO, Justin Borgman, and Ryan Witcomb, Vice President of Data Analytics and Engineering at ScotiaBank. So, what do you do when you feel the pull of Data Gravity? One solution is to access your data where it lives through an abstraction layer such as Starburst.

What is Data Gravity?

Dave McCrory first came up with the idea of Data Gravity while discussing physical, planetary gravity with a colleague, when he realized that the general concept could be applied to data. An increase in mass, or volume of data, increases the attraction. However, you also need activity. The combination of massive amounts of data and tremendous activity or interest in that data creates the strongest effects. The closer an application is to the data, the more efficient and accurate its queries on that data are. “Data Gravity inhibits enterprise workflow performance, raises security concerns, and increases costs, all complicated by regulatory requirements and other artificial constraints,” according to a Digital Realty piece authored by Dave McCrory himself. Now that many companies are moving their data to the cloud and away from its original source, the challenge of Data Gravity has become increasingly common.

Moving to the Cloud: Increasing the Gravitational Pull

ScotiaBank, a 187-year-old business, has been through it all in terms of data management. Their journey is reminiscent of many other companies: they began with an enterprise data warehouse and made the move to an enterprise data lake. As a result, they faced many of the same challenges. One of these was that when they decided to move to the cloud, after being heavily integrated with on-premises storage, they were fighting a gravitational pull. Data Gravity wanted to hold these applications close to their source, so moving them to the cloud threatened to collapse the whole system. Their solution was to adopt a hybrid approach. By maintaining a presence on premises, they could satisfy the integrations that already existed there. By also enabling future capabilities in the cloud, they were able to have the data they needed readily available.

Who is Affected by Data Gravity?

This concept is particularly important for banks and the financial services industry. They often have to consider not only the volume but also the complexity and velocity of data. The Anti-Money Laundering industry faces challenges in these regards, along with the challenge of dealing with different data systems that have different needs elsewhere. The problem becomes knowing how to enable one common centralized function to leverage different data sets. An abstraction layer, like Starburst, plays a big role in this. Because it decentralizes data solutions, users don’t have to worry about the intricacies of the data. Common data discovery and common interfaces allow all users to access it.

For cloud native companies, there is one homogeneous security environment, so the security standards are different from those on premises. One common path to security makes it faster to move applications between the cloud and on premises. Fast access, accuracy, and governance are essential principles of data management, but the security and protection of client data should never be sacrificed.

What is the Solution to the Problem?

Starburst and a Data Mesh fit into this idea of Data Gravity. Having a single source of secure access to data can help fight the strong pull. All of your data is close to you and ready for analytics without having to copy and move it constantly. By providing companies with an abstraction layer and quick access to distributed data, organizations can achieve a decentralized data solution and overcome some of the problems of Data Gravity.
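As a rough illustration of querying data where it lives, the toy sketch below joins tables held in two separate databases through one SQL interface, without copying rows from one store into the other. It uses Python's built-in sqlite3 module and its ATTACH DATABASE statement purely as a stand-in for an abstraction layer. It is not Starburst's API, and the database alias (lake), table names, and sample rows are all hypothetical.

```python
import sqlite3


def build_sources() -> sqlite3.Connection:
    """Create two separate in-memory databases reachable from one connection.

    Assumption: 'main' plays the role of an on-premises system and 'lake'
    plays the role of a cloud data lake. Both names are illustrative only.
    """
    conn = sqlite3.connect(":memory:")
    # ATTACH adds a second, independent database under the alias 'lake'.
    conn.execute("ATTACH DATABASE ':memory:' AS lake")
    conn.execute(
        "CREATE TABLE main.accounts (account_id INTEGER PRIMARY KEY, owner TEXT)"
    )
    conn.execute(
        "CREATE TABLE lake.transactions (account_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO main.accounts VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    conn.executemany(
        "INSERT INTO lake.transactions VALUES (?, ?)",
        [(1, 120.0), (1, 30.0), (2, 75.5)],
    )
    return conn


def totals_by_owner(conn: sqlite3.Connection) -> list:
    """Join across both databases in a single query; no data is copied."""
    return conn.execute(
        """
        SELECT a.owner, SUM(t.amount)
        FROM main.accounts AS a
        JOIN lake.transactions AS t ON t.account_id = a.account_id
        GROUP BY a.owner
        ORDER BY a.owner
        """
    ).fetchall()


if __name__ == "__main__":
    for owner, total in totals_by_owner(build_sources()):
        print(owner, total)
```

The caller writes one query and never needs to know which store holds which table. A real federated engine applies the same idea across systems that sit in different locations.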

For every company that is not cloud native, the ability to perform hybrid analytics across platforms is essential. McCrory suggests supporting central data centers, and helping organizations create them, to bridge the gap and fight the force of Data Gravity. Digital Realty focuses on providing fast and secure access for companies in whichever way they prefer.

Achieving Zero Gravity

Organizations have increasingly realized the value of using their data in all aspects of their business. It has therefore become more important to learn how best to manage this data in order to realize all of that value. The problem of Data Gravity has prompted the creation of solutions, like Starburst, that promote a decentralized approach to data access. This way, data from disparate sources can all be queried and analyzed together, thereby accelerating time-to-insight.

Jacqueline Vail

Marketing Communications Specialist, Starburst

