Internet-based cloud data storage offers significant performance, security, and accessibility advantages and provides the foundation for innovation and growth. At the same time, many data management challenges migrate to the cloud right along with the data.
This guide will review cloud data storage in its various shapes and use cases and discuss how a modern, cloud-based approach to data analytics can empower data-driven enterprises.
The National Institute of Standards and Technology defines cloud computing as:
“A model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”
Within that context, cloud data covers any data stored or processed on internet-accessible remote servers, whether company-owned or hosted by third-party cloud services.
Far more than the physical locations of storage servers, the difference between on-premises and cloud data determines the shape of a company’s information architecture, operational efficiency, and ability to thrive in a dynamic competitive environment.
In the on-premises model, an organization assumes absolute control over data stored in dedicated data centers, each facility’s server room, office workstations, or elsewhere in the physical infrastructure.
With data living on company-owned systems, organizations decide how to store, move, and process data. Security teams have total authority over access policies and security measures.
Yet, control imposes responsibilities. Companies must invest to ensure infrastructure capacity, performance, and accessibility meet business demands. This investment includes the funding and training of security teams to manage a dynamic threat landscape.
Similarly, the on-premises model requires robust and expensive resilience strategies. Investments in redundant data centers and backup locations can maintain business continuity but at a significant cost.
The cloud model moves data from company-owned servers to the distributed infrastructure of a cloud service provider (CSP). The provider’s customers share pooled storage resources in a multi-tenant model. With high utilization rates, CSPs can optimize capacity, network latency, and other performance metrics.
While the data could reside anywhere within a CSP’s system, customers typically choose data centers in one or more specific regions to meet performance, resilience, or compliance requirements.
However, companies sacrifice some of the control they had in the on-premises model. CSPs and their customers operate within a shared responsibility model. For instance, providers handle their physical infrastructure’s configuration, maintenance, and security while their customers manage access control.
In our survey, The State of Data and What’s Next, 59% of respondents said their organizations’ data lives in the cloud. The factors driving cloud adoption include:
Traditional data centers use capital inefficiently. Companies must invest in peak capacity, which lies dormant for much of the year. Switching to cloud data storage replaces this fixed capital expense with a scalable operational expense.
Subscription-based pricing lets companies pay for the data they actually store rather than the data they may store. Moreover, how much they pay scales up and down with their workloads. This scalability can save a considerable amount compared with on-premises hardware sized for peak demand.
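The difference between paying for peak capacity and paying for actual usage can be sketched with some simple arithmetic. The monthly demand figures and the unit price below are made-up values for illustration, and the sketch assumes the same per-terabyte price in both models, which real deployments rarely match.

```python
# Hypothetical monthly storage demand in terabytes (illustrative numbers only).
monthly_demand_tb = [40, 42, 45, 43, 50, 55, 60, 58, 52, 48, 90, 120]

# Assumed unit price; real prices vary by provider, region, and storage tier.
PRICE_PER_TB_MONTH = 20.0


def fixed_capacity_cost(demand, price):
    """Cost when capacity is sized for the peak month and paid for all year."""
    return max(demand) * price * len(demand)


def pay_as_you_go_cost(demand, price):
    """Cost when each month is billed on the capacity actually used."""
    return sum(demand) * price


fixed = fixed_capacity_cost(monthly_demand_tb, PRICE_PER_TB_MONTH)
elastic = pay_as_you_go_cost(monthly_demand_tb, PRICE_PER_TB_MONTH)

# Share of the peak-sized capacity that is actually used over the year.
utilization = sum(monthly_demand_tb) / (max(monthly_demand_tb) * len(monthly_demand_tb))

print(f"Peak-sized capacity: ${fixed:,.0f}")
print(f"Pay-as-you-go:       ${elastic:,.0f}")
print(f"Average utilization of peak capacity: {utilization:.0%}")
```

With these assumed numbers, capacity sized for the busiest month sits less than half used on average, which is the inefficiency the pay-for-what-you-store model avoids.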
Compared to an on-premises network, a CSP’s networks offer far more capacity than any single customer is likely to need, making cloud solutions better equipped to handle sudden demand swings. Network architects don’t need to optimize systems for worst-case scenarios; they can rely on the CSP’s infrastructure to handle traffic spikes.
Placing data in the cloud speeds the development of applications, services, and other data products. Cloud service providers offer data management APIs that enable workload automation and accelerate time-to-market.
The cloud makes enterprise data available to any employee with a decent internet connection, helping managers make more effective analytics-based decisions. In addition, cloud-enabled collaboration lets dispersed teams work in real-time on the same data and applications.
The cloud does not eliminate cybersecurity threats but makes data protection more manageable. Unlike most medium-to-large organizations, CSPs can afford 24x7x365 security centers staffed by security experts who oversee advanced, well-maintained security technologies. It seems counter-intuitive, but ceding ownership of infrastructure security to a cloud storage service can be a better way to protect sensitive data.
The cloud model also supports disaster recovery plans since data stored in multiple regions remains instantly accessible. CSPs also offer data backup options to speed recovery from outages or cyberattacks.
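A rough way to see why multi-region storage helps resilience is to estimate the chance that at least one copy of the data is reachable. The sketch below assumes regional outages are independent, which is a simplification, and the 99% figure is a made-up input rather than any provider's published number.

```python
def availability_with_replicas(region_availability: float, replicas: int) -> float:
    """Probability that at least one of `replicas` regions is reachable.

    Assumes each region fails independently of the others.
    """
    if not 0.0 <= region_availability <= 1.0:
        raise ValueError("region_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # All replicas must be down at once for the data to be unreachable.
    return 1.0 - (1.0 - region_availability) ** replicas


# Hypothetical per-region availability of 99%.
for n in (1, 2, 3):
    print(n, f"{availability_with_replicas(0.99, n):.6f}")
```

Correlated failures, such as a provider-wide outage, would make the real figure lower than this estimate.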
Cloud data services can take many forms, from internet-based file storage to software-as-a-service (SaaS) replacements for proprietary applications.
The SaaS business model replaces on-premises software with applications running in the cloud. Accessible anywhere on any device, SaaS applications also foster deeper collaboration through shared workspaces and web-based communications.
Better security is among cloud applications’ business benefits. Data stored within the software remains in the cloud rather than on user devices. With encryption and security updates provided by the operating system, browser interfaces are easier to keep in compliance with security policies.
Productivity apps are among the earliest candidates for a cloud migration initiative thanks to services such as Google Workspace and Microsoft Office 365.
Cloud storage services such as Google Drive, Microsoft OneDrive, and Amazon Simple Storage Service (S3) duplicate the directory-based file storage structures of an on-premises server, letting users access files as they always have from a Windows desktop or mobile apps running on iOS or Android.
The advantage for businesses is that — once authenticated and authorized — users access their files directly. The company does not need to maintain VPN gateways, build data storage capacity, or support traffic on its private network.
Although file storage services meet user needs, the cloud comes into its own when enterprises switch to cloud databases. Cloud databases like Amazon Relational Database Service (RDS), Microsoft Azure SQL Database, and Google Cloud SQL transform application development in addition to providing scalable storage and efficient data retrieval. DevOps teams harness the power of database providers’ APIs to turn any business application into a cloud app.
While cloud databases streamline day-to-day operations, they do not erase the inherently distributed nature of enterprise data. Like their on-premises predecessors, cloud data warehouses like Amazon Redshift and Snowflake attempt to centralize data to run complex queries faster and improve reporting.
Cloud-based analytics platforms, such as Amazon Elastic MapReduce (EMR) and Google BigQuery, offer tools for specific big data use cases. As we’ll see, the Starburst Galaxy data lake analytics platform powers data applications and lets companies run interactive analytic queries on large datasets.
Cloud architectures can take many shapes depending on the choices a company makes. For example, some prefer the advantages of a single provider, while others choose multiple providers to avoid vendor lock-in. In general, these choices result in a combination of the following four cloud storage architectures.
A public cloud architecture benefits from the multi-tenant approach of providers like AWS S3, Microsoft Azure Blob Storage, or Google Cloud Storage. The company’s operational expenses are lower since the CSP can spread costs across many customers. And since the CSP designs its infrastructure to support so many customers, public cloud storage is more performant and reliable.
Security and compliance concerns may require a different approach. Healthcare, finance, and other heavily regulated industries pay extra for their CSP’s private cloud services. The CSP allocates dedicated servers and networks to each customer, and the customer takes a more active role in managing them.
At the largest end of the spectrum, enterprises may develop their own cloud storage infrastructure by using open-source solutions or by licensing a CSP’s technology.
No company can migrate to the cloud instantly. Most will continue to run on-premises systems for several years, and many will never completely transition since certain critical systems must stay on-premises. For example, mainframes remain powerful tools for processing extreme volumes of real-time transactional data.
A hybrid cloud architecture must meet the challenges of integrating legacy on-premises systems with new cloud-based storage solutions.
Although the terms “on-premises” and “cloud” sound like data is in one place or another, the reality is much less centralized. On-prem data gets stored in various databases, applications, file systems, and data warehouses.
The same is just as true in the cloud. A company’s data rarely resides within one service provider’s platform. As a result, data teams must manage a multi-cloud architecture that integrates multiple vendors into a single system.
Starburst Galaxy is a cloud-based data analytics solution that unifies data in private, public, hybrid-cloud, and multi-cloud environments. Designed to handle petabyte-scale data sets and empower individual users, Galaxy provides a single point of access for users across the organization.
Starburst Galaxy also includes built-in security and governance features that streamline compliance with data privacy and other regulations.
With the Starburst Galaxy data cloud analytics platform, data consumers can access any data anywhere in the company from anywhere in the world.
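The idea of a single point of access over data held in separate systems can be illustrated with a small stand-in. The sketch below is an analogy only: it uses Python's built-in `sqlite3` module, not Starburst Galaxy's API, and the database names `crm` and `billing`, the tables, and the rows are all hypothetical.

```python
import sqlite3

# One connection acts as the single access point.
conn = sqlite3.connect(":memory:")

# Two separate in-memory databases stand in for independent systems.
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS billing")

conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO billing.invoices VALUES (?, ?)",
    [(1, 120.0), (1, 80.0), (2, 50.0)],
)

# A single query joins tables that live in different databases.
rows = conn.execute(
    """
    SELECT c.name, SUM(i.amount) AS total
    FROM crm.customers AS c
    JOIN billing.invoices AS i ON i.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

for name, total in rows:
    print(name, total)
```

The user writes one query and does not need to know where each table is stored; a federated analytics platform applies the same principle across cloud and on-premises sources at much larger scale.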