More than any other industry, Financial Services is likely to only partially realize the elusive utopian state of ‘the single source of truth’ for all its data. Between robust regulatory requirements, strict data governance policies, and heightened risk, it’s no wonder that experienced FinServ data practitioners accept disparate data as a necessity to survive and thrive. Yet, decade after decade, platform providers continue to push their agenda to standardize on data warehouses or data lakes, and now cloud data warehouses, cloud data lakes, and even data lakehouses. Each of these architectures requires the data you use for analytics to go into a lake or a warehouse.

Whether it’s a 100-year-old institution or a digitally native fintech start-up, the intricacies of doing business make it so that there will be a distributed data footprint, no matter the organization. 

In this blog, we explore the reasons behind this reality and why financial firms can’t afford to wait for their data modernization initiatives to finish before creating new and increased value from their distributed data. We do not intend to discredit any cloud migration strategy, as the cloud has cemented its place as a preferred storage option in the data ecosystem as we know it. Instead, we argue that every business must weigh the same ramifications of data centralization, whether on-premises or in the cloud.

Financial business is a complex business

Organizations across various industries began the journey to the cloud years ago by choice or a competitive force. However, as the stewards of the global financial system, concerns around data security, sovereignty, governance, and a growing body of regulatory scrutiny have forced financial institutions like banks, asset managers, indices, and others to embrace a much more cautious approach to modernization.

As such, the data architectures of these institutions are in flux as non-critical workloads and data move to the cloud while critical information stays in self-managed on-premises environments. As a result of this hybrid cloud strategy, the drive toward a single data store as the ultimate source of truth for all data can be considered a pipe dream for the near future.

Even a digital innovator like Nasdaq is just beginning to make serious strides in pursuing a modern exchange. Nasdaq announced its partnership with Amazon Web Services (AWS) on November 30, 2021, making AWS its preferred cloud provider for capital markets. One year later, it announced it had completed the migration of Nasdaq MRX – one of its six U.S. options exchanges – to AWS. 1

So why is there a disconnect between what most technology vendors advocate and what financial firms are putting into practice? Ultimately it comes down to the fundamentals of doing business in the financial sector. Setting aside business-specific data policies, our work with Financial Services customers, which include 4 of the top 6 North American banks, has surfaced these common themes:

  • Increasing regulations and a growing trend toward data sovereignty —

Unlike any other industry, Financial Services companies are subject to various regulatory requirements that affect how they collect, store, and use data. These requirements may vary depending on the type of data, the country in which it’s collected, and how it’s used. As a result, financial institutions may need to support multiple data systems to comply with these requirements. The regulatory complexity is intensifying as countries, geopolitical zones, and even states, provinces, and districts implement local regulations that complement or replace what were once regarded as global standards. A few recent examples include UK reporting requirements, India’s requirement that transaction data be stored locally, and California’s CCPA. Though not industry-specific, the CCPA matters to the industry as it aims to create personalized experiences and offers for its clients. The strictest data sovereignty laws, like those in Germany, France, and a few others, mandate that citizens’ data be stored on servers within the country’s physical borders. 2

  • Mergers & Acquisitions (M&A) —

Financial Services M&A set a new record in 2021, with deal value soaring above $1 trillion. 3 This was primarily due to a bullish forecast coming out of the pandemic and cheap money from record-low interest rates. Even so, this trend may continue into 2023, with fintech companies becoming targets of larger financial firms as valuations normalize. The M&A cycle is already in flight, as shown by Fidelity’s acquisition of fintech startup Shoobx, a leading provider of automated equity management operations and financing software for private companies at all growth stages, up to and including an initial public offering (IPO). 4 Without diving into all the reasons why M&As happen, one near guarantee in every instance is the need to integrate multiple data systems from two different stacks. Integrating these systems can be complex and time-consuming, which makes it difficult to achieve a single source for data and delays the value realized from the newly acquired data assets. This eventually turns into accumulated tech debt. When paired with the prolonged time needed to execute the data migration, the data can become obsolete because of its age or because of new or shifting priorities.

  • Third-party data — 

Organizations are increasingly turning to third-party data providers for additional attributes that complement their own data sets. This data can be traditional financial data or the increasingly relevant alternative data: data not traditionally used by financial markets, including but not limited to weather data, satellite imagery, social media data, and IoT data. Ultimately, these data providers have policies on how their data must be stored and secured. As shared by a Fortune 200 investment and insurance company based in New England (the northeastern corner of the US), some of its data providers have on-premises requirements to control unlicensed access. Furthermore, critical third-party data such as credit card information needs to be tokenized, obfuscated, and encrypted in the cloud, which may remove its immediate value. So now, by default, any company in the cloud needs to seriously consider when and how it leverages third-party data if the licensing agreement includes specific storage and access requirements, which is common unless an unlimited license of sorts is purchased.

  • Multi-cloud and hybrid cloud —  

Even Nasdaq, with its latest partnership with AWS, has strategically opted for a multi-cloud strategy using Microsoft Azure and AWS across its tech stack. Though all its data may eventually live in these cloud providers’ data lakes, the fact that it lives in independent systems by default destroys any chance for a single source of truth, at least until the cloud providers themselves enable and embrace multi-cloud capabilities for their customers. Nasdaq isn’t alone; just as consumers are advised to hedge their bets with a diversified investment portfolio, financial firms are heeding the same wisdom. Over the next three years, 56% of Financial Services organizations expect to operate in a multi-cloud environment, and 82% agreed that hybrid multi-cloud is the ideal IT operating model for their companies. 5

  • Legacy tech stack  

This is straightforward; many financial institutions have been around for decades and have accumulated a lot of tech and data. As a result, they have legacy systems that are difficult to replace or integrate with new systems. So as organizations pursue modernization, migration is inevitable, but the plan should also cover how to ensure continuity of data access and usage without duplication and ballooning costs.


Shifting mindsets from a single source of truth to a single point of access

If the single source of truth is not a thing for financial firms, how do these businesses harness the newfound promise of modern analytics: faster, cheaper, more reliable, more intelligent, and more complete? Well, as a data governance executive from a leading US financial institution shared in preparation for the upcoming Datanova conference, the vision would be to embrace some form of data mesh or data fabric architecture that lets them leave the data where it is and enables users (business analysts, data analysts, data engineers, and data scientists) to access it with all the necessary governance and access controls fully enforced. And for the near future, that may be the fastest, most complete, and most realistic path to realizing the data strategies executives have committed to their boards, until a single source of truth becomes a real thing for Financial Services.

Let’s imagine a Customer360 initiative for millennials. They are the next most significant financial group and of age for core financial products as they become homeowners, have families, increase focus on retirement funds, and look for other financial vehicles across daily use and life events. The consumer profile data, wealth management data, credit data, banking transaction data, and any third-party data on this cohort are dispersed globally for the reasons listed above. Centralizing this data isn’t an option, nor is ignoring it or blaming regulatory restrictions. Instead, financial firms should explore how to enable authorized users to access the data where it is stored, so they can build the necessary profiles of millennials for the next marketing campaign, for detecting fraudulent behavior, or for any other business application. The business gets to extract valuable information on this prime cohort, while data stewards get to ensure the necessary controls are in place. A similar initiative was executed at Assurance, a leading health insurance marketplace; using Starburst, Assurance increased conversion rates by 10% and reduced costs by 40% during the United States open enrollment season for health insurance.

Taking this a step further, the single point of access also helps turn your data assets into data products that improve productivity and ensure a consistent standard across data sets, one that enforces governance, security, quality, and lineage. Whether it’s the marketing, business planning, client relations, or contact center team, everyone can access the millennials’ profile data product, based on their permissions, to ensure the business objectives are achieved.
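
To make the idea concrete, here is a minimal Python sketch of a single point of access over data that stays in place. It is an illustration only, not Starburst’s API or any firm’s actual implementation: the `Source` and `AccessPoint` classes, the role names, and the sample records are all hypothetical, and in-memory dictionaries stand in for real systems in different regions.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """A data store that stays where it is (hypothetical stand-in for a real system)."""
    name: str
    region: str
    rows: dict  # customer_id -> attributes held by this source


@dataclass
class AccessPoint:
    """Hypothetical single point of access: routes reads and enforces role grants."""
    sources: dict = field(default_factory=dict)  # source name -> Source
    grants: dict = field(default_factory=dict)   # role -> set of source names

    def register(self, source: Source) -> None:
        self.sources[source.name] = source

    def grant(self, role: str, source_name: str) -> None:
        if source_name not in self.sources:
            raise KeyError(f"unknown source: {source_name}")
        self.grants.setdefault(role, set()).add(source_name)

    def profile(self, role: str, customer_id: str) -> dict:
        """Assemble a customer view from only the sources this role may read."""
        view = {"customer_id": customer_id}
        for name in sorted(self.grants.get(role, set())):
            record = self.sources[name].rows.get(customer_id)
            if record is not None:
                # Prefix each key with its source name so lineage stays visible.
                view.update({f"{name}.{key}": value for key, value in record.items()})
        return view


# Illustrative data only: two sources in different regions, never copied together.
access = AccessPoint()
access.register(Source("banking", "eu", {"c1": {"balance": 1200}}))
access.register(Source("wealth", "us", {"c1": {"portfolio": "balanced"}}))
access.grant("marketing", "banking")
access.grant("advisor", "banking")
access.grant("advisor", "wealth")

marketing_view = access.profile("marketing", "c1")  # banking attributes only
advisor_view = access.profile("advisor", "c1")      # banking and wealth attributes
```

The design choice worth noting is that nothing is centralized: each role sees a view assembled at read time from the sources it has been granted, so governance is enforced at the point of access rather than by moving the data.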

Wrap-up: considerations when data centralization efforts are planned

Every financial firm is on a journey to increase revenue, lower costs, mitigate risks, and ultimately reach the next inflection point in its growth trajectory. Ideally, utilizing the data they have available to make data-driven decisions will result in achieving these goals faster. 

Today and every day moving forward, the insights extracted from hybrid or federated data sources will drive that innovation. From our experience with financial firms around the world, organizations can’t wait for their centralization efforts to be done; amongst many other things, they run the risk of stale data or data duplication – which is highly likely to increase costs. 

The reality is that clients, investors, and shareholders also have no interest in waiting as competitors and distributors continue to push the limits of what is possible when data is used quickly and effectively.  Instead, consider what a federated data and analytics strategy (data mesh or data fabric) can do for you with a single point of access – not a single source of truth. When this happens, the real magic of decoupling storage and compute comes to life.

So, while these data centralization efforts are being planned or in flight, consider the following: 

  1. What are the business reasons driving the need to centralize? Is it to lower costs from on-premises or legacy licensing contracts, to take advantage of modern technologies, to meet governance and regulatory requirements, or something else?
  2. Do you realize the full benefits of separating storage and compute if the architecture depends on specific storage types to gain performance and new capabilities?
  3. How quickly do you need to start extracting insights from your data sets? Can you wait for months or years? Will the data become stale or obsolete over time?
  4. How do you continue to use the data to run your business? Are there benefits to enabling federated data access to run analytics?
  5. What types of business value can potentially be extracted from the federated data sets if you run analytics on them while meeting sovereignty and regulatory requirements?
  6. How do you want to manage the governance of the data? Are you embracing a centralized model or a decentralized model? How are you getting data creators to take responsibility for their data, so it’s no longer a conversation about data governance but rather data ownership?
  7. What is your risk mitigation framework? What happens when the lights go off or when things don’t run as planned with the centralization efforts? 

To learn more about Starburst for Financial Services, read why FINRA analyzes billions of daily trading events to detect fraud, insider trading, and abuse with Starburst. 

Note: Not all references to specific institutions in this blog are of Starburst customers. 


1 Nasdaq Completes Migration of the First U.S. Options Market to AWS | Nasdaq
2 3 things companies must know about data sovereignty when moving to the cloud | The Enterprisers Project
3 Financial Services M&A set a new record in 2021 | kpmg.us
4 Fidelity Investments® Acquires Shoobx®, Bolstering Commitment to Equity Plan Capabilities for Private Companies | Fidelity Investments
5 Fourth Annual Enterprise Cloud Index for Financial Services | nutanix.com
