Most organizations generate and collect data every day, but many struggle to get value from it, which leaves them exposed to cost and growth inefficiencies. Those that can enlist data to work in lockstep with the business strategy will be far more innovative and competitive.

Much of this data is considered ‘dark data’: data that is collected but never used, which makes up most of what organizations collect. Dark data overlaps heavily with unstructured data, and nearly 80-90% of the data in the enterprise is unstructured or semi-structured.

The business opportunities may be vast, but it helps to have a clearer picture of the current state of data. That’s why we’ve compiled over 80 data & analytics statistics for 2022, which show the prevalence of and need for data in all facets of business. These stats cover the volume, velocity, and variety of data, the importance of data in scaling AI, and upcoming data & analytics career opportunities.

For more in-depth insights check out our data & analytics resources.


14 Benefits of successfully managing data as a product and executing a data-driven strategy

Organizations that have established a sustainable path to data have found a better way to put data to work. One way is to manage data as a product while executing a strong data-driven strategy.

Enduring business benefits of integrating a data analytics strategy 

  1. Research has shown that organizations with a strong data culture have nearly 2x the success rate and 3x the return from AI investments. (Accenture)
  2. By managing data as a product, new business use cases can be delivered as much as 90 percent faster. (McKinsey)
  3. Firms with advanced insights-driven business capabilities are eight times more likely to say they grew by 20% or more than beginner firms. (Forrester)
  4. ESG’s models predicted that Starburst Data can reduce the costs of data analytics by up to 53% and time to insight by up to 90%, as well as enable an increase in revenue of up to 2%. (Economic Validation Report, ESG, a division of TechTarget)
  5. Benefit of managing data as a product: Total cost of ownership, including technology, development, and maintenance costs, can decline by 30 percent. (McKinsey)

Proof points with impact, by industry

  1. Financial Services: Siam Commercial Bank…They implemented a program to create clear guidelines for effective and secure use of data and analytics and implemented automation that reduced manual processes by 40%, improving accuracy while securing higher customer satisfaction. (Accenture)
  2. Financial Services: FINRA has a scalable, cost-effective way to analyze its constantly growing volumes of data…analyze 25PB of data, with 100B rows of new data per day from 25+ sources. (FINRA)
  3. Healthcare: “Being able to pull data into our current platform, with the appropriate information governance and data permissions, we can give our customers a view across 100 or even all 4,000 EMIS GP practices.” (EMIS Group)
  4. Healthcare: Kaiser Permanente has been using a combination of analytics, machine learning, and AI to overhaul the data operations of its 39 hospitals and more than 700 medical offices in the US since 2015. (CIO)
  5. Healthcare: Big Data analytics for the healthcare industry could reach $79.23 billion by 2028. (Vantage Market Research)
  6. Manufacturing: 72% of manufacturing executives said that they considered advanced analytics to be important, but only 17% said that they had captured satisfactory value from it. (BCG)
  7. Manufacturing: Analytics has helped manufacturer Owens Corning reduce the testing time for any given new material from 10 days to about two hours. (CIO)
  8. Food & Beverages: A top FMCG drinks brand’s Data Science Platform, along with an “Intelligent Prospecting” Leads Engine data product, Data Science for Net Revenue Management, and a Global Ag Data Strategy, has delivered a 12% improvement in forecasting and an error reduction worth $2.7B to the company. (Kin + Carta)
  9. eCommerce: “We are saving 50% on our AWS compute costs.” (Zalando)

9 Noteworthy data analytics market trends

  1. The global big data analytics market is projected to grow from $271.83 billion in 2022 to $655.53 billion by 2029 (Fortune Business Insights)
  2. From 2020 to 2025, IDC forecasts new data creation to grow at a compound annual growth rate (CAGR) of 23%, resulting in approximately 175ZB of data creation by 2025. (Datanami)
  3. The global Big Data market is projected to generate $103 billion in revenue by 2027. (siliconANGLE)
  4. It’s predicted that the global big data and business analytics market will reach $448 billion in spending by 2027. (Datanami)
  5. By 2025, an estimated 463 exabytes of data will be generated each day. (LinkedIn)
  6. The data warehousing industry boasts 37,402 companies, focused on 64 technologies in total. 4,244 companies use Snowflake, followed by 1,995 companies that use SAP Business Warehouse, and 1,831 companies that use Amazon Redshift. (Datanyze)
  7. The total volume of data managed is projected to grow at a CAGR of 41% through 2023, with even more dramatic growth in certain industries, such as Marketing and Advertising. The data shows the complexity of big data is beyond what current solutions can handle. (Data Teams’ Outlook on Data Warehousing in 2023, Firebolt)
  8. 2012: A chart of the big data ecosystem (Matt Turck)
  9. 2021: A chart of the Machine Learning, Artificial Intelligence, and Data Landscape 2021 (Matt Turck)
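
Several of the projections above are quoted as a start value and an end value, while others are quoted as a compound annual growth rate (CAGR). To compare them, it can help to convert one form into the other. The sketch below is an illustration only: the helper name `implied_cagr` is hypothetical, and it assumes the Fortune Business Insights figures are both in USD billions and span the seven years from 2022 to 2029.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.10 means 10%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    # CAGR = (end / start) ** (1 / years) - 1
    return (end_value / start_value) ** (1 / years) - 1


# Figures taken from the Fortune Business Insights projection (USD billions).
market_2022 = 271.83
market_2029 = 655.53

# Assumption: growth compounds once per year over 2029 - 2022 = 7 years.
cagr = implied_cagr(market_2022, market_2029, years=2029 - 2022)
print(f"Implied CAGR 2022-2029: {cagr:.1%}")
```

Under these assumptions the implied rate works out to roughly 13% a year, which is a useful sanity check when setting this projection next to growth rates from other sources that use different base years or market definitions.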

21 Stats on organizational culture, literacy, strategy, challenges, and opportunities

  1. 91.9% of organizations said culture was the greatest impediment to becoming data-driven. (New Vantage Partners)
  2. Only 26.5% of organizations have achieved the goal to become data-driven (New Vantage Partners)
  3. Just 19.3% indicate that they have established a data culture. (New Vantage Partners)
  4. Nearly 30% of the data generated will be consumed in real-time by 2025. (Datanami)
  5. Only 39.7% reported that they were managing data as an enterprise business asset. (New Vantage Partners)
  6. 75% of executives plan to accelerate digital transformation, including an emphasis on moving to cloud (26% increase compared to pre-COVID). (Accenture)
  7. 81% of organizations lack an enterprise data strategy to fully capitalize on their data assets. (Accenture)
  8. 75% of data and analytics decision-makers responded that they are changing their management culture to rely on more decision-making with data. (Data And Analytics Survey, Forrester)
  9. By 2023, data literacy will become essential in driving business value, demonstrated by its formal inclusion in over 80% of data and analytics strategies and change management programs. (Gartner)
  10. 60% of companies with more than 1,000 employees have already at least partially established data mesh principles. (PWC)
  11. Only 8 percent offer all employees the opportunity to conduct self-service data analyses by themselves. (PWC)
  12. 54% of IT managers say data quality is the biggest challenge in relation to data & analytics. (PWC)
  13. 3 out of 4 C-level executives consider data transparency, reliability and trust as critical improvement areas. (PWC)
  14. 93% of IT managers have discussed Data Mesh within their organization, and 83% of them want to adopt the concept. (PWC)
  15. 65% state that streaming data is the primary data that they will collect. (State of Data, EMA Research)
  16. 27% indicate their Big Data projects are already profitable, and 45% indicate they’re at a break-even stage. (CapGemini)
  17. 79% of companies believe that not using big data will bankrupt them. (Accenture)
  18. Three top drivers for data, analytics and AI:
    1. Modernize enterprise data on cloud
    2. Make better decisions with analytics
    3. Enable data-driven business transformation (PWC)
  19. A survey of over 1,600 C-suite executives and data-science leaders from the world’s largest organizations found that nearly 75% of companies have already integrated AI into their business strategies and have reworked their cloud plans to achieve AI success. (The art of AI maturity, Accenture)
  20. 89% said that their organizations missed business opportunities because of data access challenges. (State of Data Engineering, Immuta)
  21. 6 missed business opportunities due to data access challenges: unable to complete a customer request; unable to innovate; unable to fix critical issues; unable to deploy a new product/service; lost a sale to a competitor; missed quarterly goals. (State of Data Engineering, Immuta)

18 Illuminating stats: data systems, data access, data quality, data pipeline

  1. 43% of IT decision makers fear their IT infrastructure won’t be able to handle future data demands (Dell Technologies)
  2. Data sprawl complexity issues are universal across industry and region…the average number in most organizations is 4-6 platforms, with at least 11% of organizations having 10-12 platforms. (State of Data, EMA Research)
  3. CMOs cite the lack of systems to connect data silos as the No. 1 barrier to achieving their full potential. (GFK Research)
  4. 49% of data will be stored in public cloud environments by 2025. (Datanami)
  5. 63% of data professionals report lacking visibility into data access controls. (State of Data Engineering, Immuta)
  6. 41% of data and IT teams are understaffed and don’t have enough people to manage or analyze their data. (State of Data Engineering, Immuta)
  7. Most people spend 4 to 6 hours per day consuming and generating data through a variety of devices and (social) applications. (Markets and Markets)
  8. 90% of data professionals agree that they could improve their understanding of the correlation between data access and data security. (State of Data Engineering, Immuta)
  9. Poor data quality costs organizations an average of $12.9 million. (Gartner)
  10. The top 3 drivers when migrating to SaaS are enhanced functionality, increased usability and a more robust user interface (UI). (Gartner)
  11. 55% of companies have a mostly manual approach to discovering data within their enterprise, and only 28 percent have a strategy in place to take advantage of it. (Accenture)
  12. 84% of IT managers have a data warehouse in use within their organization. (PWC)
  13. 78% of IT managers are using or implementing a data lake in their organization. (PWC)
  14. Two-thirds (66%) are using a data lakehouse, and 84% of those who aren’t currently using one are looking to do so. (VentureBeat)
  15. A constant trend remains to drive business decisions with SQL analytics at the forefront, with 53% of respondents rating it the highest importance to their analytical program. (State of Data, EMA Research)
  16. Over 48% said that they take more than a business day to develop a data pipeline, with 32% taking three days to two months. Digital transformation and today’s unprecedented rapid change require even faster responses to business events, and data pipelines are needed to process and deliver valuable insights. (State of Data, EMA Research)
  17. The first and most obvious solution for data dispersion and multi-platform complexity is a move to the cloud. In 2021, respondents reported that 56% of their data was in the cloud versus 44% on-premises. When asked the same question this year, respondents stated that 59% of their organization’s data resides in the cloud compared to 41% on-premises. This trend will persist as companies continue their digital transformation journey. (State of Data, EMA Research)
  18. Data teams spend upwards of 40% of their time on data quality issues instead of working on revenue-generating activities for the business. (TheNewStack/Forrester)

6 Data sovereignty and privacy stats

In the data analytics and compliance space, data sovereignty and privacy are concepts that have our attention. 

Here’s what Alexander Seeholzer, Director of Data Services at SOPHiA GENETICS, said about performing data & analytics functions without jeopardizing compliance: “Cross-regional querying is also very important. Due to data compliance laws, data can only remain in its country of origin.”

  1. 99% said that their organizations expect to incur planned or unplanned egress fees at least on an annual basis. Companies are moving lots of data to lots of different places, and it’s accelerating. (Datanami)
  2. More than 100 countries have some form of data sovereignty law in place. (Wikipedia)
  3. 48% of consumers have stopped buying from a company over privacy concerns. (Tableau)
  4. 97% of companies have seen benefits like a competitive advantage or investor appeal from investing in privacy. (Cisco)
  5. 54% of consumers say companies don’t use data in a way that benefits them. (Tableau)
  6. 42% of companies say that investing in privacy has enabled agility and innovation in their organizations. (Cisco)

7 Vital stats on the chief data officer role

The chief data officer (CDO) is a senior executive who faces a dual challenge: complying with evolving regulations and standards on data quality, data collection, privacy, security, and governance, while getting the most out of the organization’s data and strategic investments. The CDO will no doubt align the data strategy with the business, but the role is still evolving and worth a closer look.

  1. 73.7% of leading companies have an executive serving in the CDAO role. (New Vantage Partners)
  2. The average tenure of a chief data officer is 30 months. (MIT Sloan)
  3. 78% of CDOs say driving business growth and value creation is their top priority. (Accenture)
  4. 86% of CDOs/Acting CDOs are involved in the development of their organization’s data strategy… (Accenture & MIT CDOIQ)
  5. 78% of CDOs assert their roles and responsibilities have become more critical, driven by the need for competitive advantage. (Accenture & MIT CDOIQ)
  6. Chief Data Officers who successfully increased data sharing led Data & Analytics teams that were 1.7 times more effective at showing demonstrable, verifiable value to Data & Analytics stakeholders. (Gartner)
  7. Lack of talent to operationalize, cultural/adoption, and lack of long-term funding are the top 3 challenges faced by CDOs. (Accenture & MIT CDOIQ)

6 Data & analytics workforce statistics and predictions

Interested in joining an organization striving to become data-driven? Now is the time: job openings are available at many organizations, including Tableau, MonteCarlo, Aerospike, Alation, and of course Starburst.

  1. The Rise of SQL: it’s become the second programming language everyone needs to know. (IEEE Spectrum)
  2. The most widely used and popular languages, like Python ($150,000), SQL ($144,000), Java ($155,000), and JavaScript ($146,000), were solidly in the middle of the salary range. (O’Reilly Data/AI Salary Survey)
  3. Data analytics is positioned to be “the most in-demand role in 2022.” (Datanami)
  4. The average annual salary for employees who worked in data or AI was $146,000 (O’Reilly Data/AI Salary Survey)
  5. Data scientists and data engineers rank third and seventh, respectively. (Glassdoor’s 50 Best Jobs in America for 2022)
  6. A search in December for data scientist positions on LinkedIn in the U.S. generated nearly 350,000 open positions; a search for data engineer jobs produced more than 220,000.

5 Top Data Mesh TV episodes

Data Mesh TV is a monthly educational program for data leaders by data leaders (host: Adrian Estala, a former CDO) about data monetization, aligning data strategy with business goals, and accelerating digital transformation initiatives with Data Mesh.

  1. Building a Digital Operations Model with a Data Mesh with Rob Akershoek, IT4IT
  2. Data Mesh For Dummies with Colleen Tartow Ph.D., Andy Mott MBA, and Adrian Estala 
  3. Vista’s Data Mesh Implementation Journey with Dr. Sebastian Klapdor, Vista
  4. Apply Product Thinking to Data with Karl Hampson, Kin + Carta
  5. Building your First Data Mesh Showcase with Ken Seier, Insight

15 Top Data Mesh Radio episodes 

There are currently 218 episodes available on Data Mesh Radio. These amazing episodes include interviews with data mesh practitioners, deep dives/how-tos, panels, “mesh musings”, and so much more. 

Host Scott Hirleman, also founder of the Data Mesh Learning Community, shares his learnings – and those of the broader data community – from over a year of deep diving into Data Mesh.

Zhamak Dehghani, the “troublemaker” and founder of Data Mesh, gave Scott Hirleman a nice shoutout for his “relentless effort to get Data Mesh community off the ground and growing it to 6.5K+ people; bring everyone together to share ideas via meetups, podcasts, slack, newsletter, etc. and doing all that with a sharply critical mind – a scarce commodity these days.”

  1. #2 Intro to Data Mesh — Mesh Musings #1
  2. #3 Discover and Create Your Necessary Data Products — Data Product Flow Interview w/ Paolo Platter
  3. #129 Iterating Data Governance for Data Mesh: Lessons Learned from ‘The Data Governance Coach’ — Interview w/ Nicola Askham
  4. #118 – Zhamak’s Corner 1 — Is Data Mesh Right For You?
  5. #113 Data Governance In Action: What Does Good Governance Look Like in Data Mesh — Interview w/ Shawn Kyzer and Gustavo Drachenberg
  6. #101 H&M’s Data Mesh Journey So Far Including Finding Reusability in Interesting Places — Interview w/ Erik Herou
  7. #20 Domain Driven Design for Data: Where to Start — Interview w/ Piethein Strengholt
  8. #126 Evolving from Data Projects to Data As a Product — A Data Platform Six Years in the Making — Interview w/ Blanca Mayayo and Pablo Alvarez Doval
  9. #121 Zhamak’s Corner 2 — Are You Ready for Data Mesh?
  10. #123 Reflecting on Multiple Data Mesh Implementations: Iterating Your Way to Success — Interview w/ Sunny Jaisinghani and Simon Massey
  11. #115 Understanding the Data Value Chain: Your Key to Deriving Value from Data — Interview w/ Marisa Fish
  12. #109 Tying Data Strategy and Architecture to Business Strategy — Interview w/ Anitha Jagadeesh
  13. #47 Skipping the Fluff of Domain Driven Design for Data Mesh — Interview w/ Lorenzo Nicora
  14. #61 Driving Value Through Participating in the Data Economy — Data Innovation Summit Takeover Interview w/ Jarkko Moilanen
  15. #88 Data Engineering and Data Engineers’ Future in Data Mesh — Interview w/ Joe Reis
