Data sprawl grows in complexity. Data science emerges as the top analytical workload. The data pipeline is fraught with challenges. Data access is now more critical than ever.
The much-awaited second annual edition of the market study, The 2022 State of Data & What’s Next, commissioned by Red Hat and Starburst and conducted by Enterprise Management Associates (EMA), was released earlier today. Some 400 organizations of varying sizes, spanning a good mix of industry verticals, were surveyed across the US, UK, Canada, France, Germany, Australia, and Singapore.
According to the report, centralization might have its benefits, but it also comes with “a single point of failure” and other challenges. More than 75% of the organizations are moving towards a decentralized model, and “this movement fits a data mesh approach based on a modern architecture for analytical data management, enabling companies to easily access and query data where it resides without transporting it to a data lake or warehouse.”
From the great data dispersion to the impact of the pandemic on data access, and from the data pipeline dilemma to the move to the cloud, the study encapsulates the current data landscape.
Data sprawl continues: organizations run an average of 4-6 data platforms, and at least 11% of them have 10-12. The result is an intricate data landscape made up of many applications and systems, plus the integration layer that connects them.
The global pandemic has had a lasting impact on data access, with 55% of organizations claiming that “access to data” is more critical after COVID-19, a 2% increase from last year. The study indicates that reliance on data will keep growing, with pressure for data access mounting to meet customer demands in an increasingly digital, highly mobile landscape.
With greater demand for data access also comes a need to support AI and ML in a complex, hybrid multi-cloud environment. Data science workloads (ML and neural networks) have displaced SQL analytics as the top analytical workload. This puts more pressure on organizations, which need to process vast amounts of data to fuel these workloads.
Add to this the complexities of data movement and data pipelines. For over 48% of survey respondents, it takes more than 24 hours to create a data pipeline, then another 24 hours to move it into production, making real-time business operations a challenge. This, combined with the need for faster data access, is pushing the industry away from the painful pipeline process and towards a more decentralized model that aligns with the data mesh approach.
Researchers at EMA note, “Organizations need to become nimble in providing fast, reliable access to data anytime, anywhere.”
Will decentralized data access be the path forward? Is there a solution to data sprawl? Will digital transformation necessitate a move to the cloud? Download the full report here for the answers to these pressing questions and more.