Every company wants real-time insights to make quick, actionable decisions that drive the business forward. The traditional approach to data platforms relies on developers and engineers with specialized skills to produce even basic reports and dashboards for business users, which slows down the time to insight and inevitably yields stale data.
Meanwhile, a more modern, human-centric approach puts the user’s needs, wants, and abilities at the center of the data platform design process. Rather than expecting users to adapt to the platform, it requires making intentional decisions based on how end users can, need, and want to digest data. By taking this approach and making data consumable for all, companies can achieve a culture of data-driven decision making.
At Datto, a leading cybersecurity and data backup company, we’ve seen the benefits of empowering our users by building a user-friendly self-service data platform. Rather than waiting on IT to generate reports for them, anyone within the organization can now answer their own questions, whether that’s how a marketing campaign is performing, whether to increase operations spend, or what the customer satisfaction scores are.
In this post, I’ll share some hard-earned lessons on important considerations when designing a user-friendly data platform and how a well-executed design can lower the barriers to insight and innovation.
Let’s start with the basics…what is a data platform?
A data platform that works for users
A data platform is a broad constellation of services and/or processes that serve the management and processing of data for a particular use case. Data platforms can be as simple as a shared spreadsheet or as complex as a system involving hundreds of moving parts.
It’s helpful to interrogate the qualities of a data platform: what makes one good or bad? Many elements go into that judgment, but the fundamental question, and the common thread in every well-architected system, is this: does the system work for users, or do users work around the system?
Data platforms: An ecosystem that enables the growth of data products
The data platform, by itself, doesn’t solve for anything in particular. It’s an ecosystem that enables data products serving context- and domain-specific needs to grow and evolve. These data products exist within that ecosystem, and in turn shape the specific topology of a given data platform.
A data platform enables users to build data products that fit their domain. Effective data platforms are ergonomic and pleasant to use, from onboarding and access through stability and the full lifecycle of downstream products (e.g., data models, marketing dashboards, and financial projections).
Inflection point: Building a data platform to support the needs of an organization
A data platform is more than just a necessity: every organization already has one.
A data platform can be as simple as Tom in accounting asking Kim from operations to email that Excel workbook. Many in the industry can relate to this kind of experience, and it’s memorably bad at scale. However, it’s sometimes legitimately sufficient for the needs of an organization.
There’s certainly an inflection point where it’s cost efficient and worthwhile to have a more sophisticated data platform, and that’s going to be different for each organization. There is also an inflection point where a poorly designed platform will hurt the business and drive engineering, analytics, and marketing talent away.
Additionally, an organization can be damaged by a data breach, by misreported financials due to poor data provenance, by crippling operational inefficiencies, and the like. I think there’s a pretty solid argument for well-designed platforms. And a well-designed data platform requires expertise, effort, ongoing stewardship, and cost.
Hot tip: Considerations when designing a human-centric data platform
It’s important to remember that as architects and platform managers we are designing an ecosystem through which real people solve real problems. That ecosystem also includes sensible and maintainable infrastructure.
As an architect, I try to think through the absolute simplest way to enable my users and then take it a step further: can I remove complexity while doing this? If I can’t figure out a way to remove complexity, I’ve found that it’s usually a case where I didn’t fully understand the problem at hand.
Balancing the operational needs of the platform while fostering an ecosystem where users can self-serve
No organization functions without a data platform, designed or not. When transitioning to a centrally managed, architected, and deployed solution, it’s crucial to find quick wins and iterate quickly with stakeholders to demonstrate the usefulness of the tooling. But the most important win is building trust.
It can be easy to forget that most of our users haven’t seen the solution to a problem before, and so they’re framing the problem they are looking to solve as constrained by their knowledge and expertise.
As an architect, I think of access patterns as the fundamental abstraction for any platform. They’re different for data scientists, web developers, and business analysts, and abstracting individual problems into patterns is the first step to solving a class or even a category of issues.
By way of example, a team of analysts will likely need their own sandbox schema from which they’ll deploy and expose data products. It’s not a point of debate. Design with the assumption that the smart, curious, and creative people you work with will need ownership to effectively do their work.
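As a rough illustration of treating the sandbox as a repeatable access pattern rather than a one-off request, here is a minimal Python sketch. Everything in it is a hypothetical assumption made for illustration: the AccessPattern class, the sandbox_ and _role naming scheme, and the generated SQL, which is generic and would need adjusting for a particular engine since GRANT syntax varies.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPattern:
    """Hypothetical description of how one group of users works with data."""
    team: str            # e.g. "marketing_analytics"
    needs_sandbox: bool  # does the team deploy its own data products?


def sandbox_statements(pattern: AccessPattern) -> list[str]:
    """Return illustrative SQL for a team-owned sandbox schema.

    The statements are generic; exact GRANT syntax differs between engines.
    """
    if not pattern.needs_sandbox:
        return []
    schema = f"sandbox_{pattern.team}"  # assumed naming convention
    role = f"{pattern.team}_role"       # assumed naming convention
    return [
        f"CREATE SCHEMA IF NOT EXISTS {schema}",
        f"GRANT ALL PRIVILEGES ON SCHEMA {schema} TO ROLE {role}",
    ]


analysts = AccessPattern(team="marketing_analytics", needs_sandbox=True)
for statement in sandbox_statements(analysts):
    print(statement)
```

The point of the sketch is that onboarding a new team becomes a matter of describing its pattern, not of negotiating access from scratch.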
Common pitfalls of designing a data platform
One of the most common mistakes I’ve seen is to focus on the tech for its own sake. A data platform is meant to empower individuals, teams, etc. A fancy platform with needless abstractions and infrastructure will make for a clever, complex, and fragile system.
If a piece of infrastructure can be removed, do it.
If code can be simplified, do it.
Be relentless and always remember the end user experience.
Users don’t care how slick the pipeline architecture is. They care if the data shows up on time. Is the data right? Is the data correct from first principles?
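Those user-facing questions can be turned into very small checks. The following is a minimal sketch, not a description of any particular tool: the function names, the 12-hour freshness window, the tolerance, and the sample figures are all made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(last_loaded_at: datetime, now: datetime, max_age: timedelta) -> bool:
    """Did the data show up on time? True if loaded within the agreed window."""
    return now - last_loaded_at <= max_age


def totals_match(source_total: float, reported_total: float,
                 tolerance: float = 0.01) -> bool:
    """Is the data right? Compare a reported figure with a total
    recomputed directly from the source rows."""
    return abs(source_total - reported_total) <= tolerance


# Made-up sample values for illustration only.
now = datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)
last_load = datetime(2024, 1, 2, 3, 0, tzinfo=timezone.utc)
source_rows = [100.0, 250.5, 49.5]

print(is_fresh(last_load, now, max_age=timedelta(hours=12)))
print(totals_match(sum(source_rows), reported_total=400.0))
```

Checks this simple are easy to explain to stakeholders, which matters more than how sophisticated they are.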
When should stakeholders be involved?
Stakeholders should be involved as early as possible.
A poor design is one that doesn’t seek to understand the way users will interact with the system. But a platform also can’t be designed by committee. As architects, we have the privilege of having seen a variety of access patterns, and it’s our responsibility to leverage that expertise.
I’ve found that users are usually excited about a shiny new piece of technology and want to play with it. So, as soon as a feature is functional and secure, let people use it! Work with users to iron out rough edges, educate platform users, and be a partner.
Fostering a culture of creativity within an organization
People are curious, by nature. From an organizational perspective, that often means implementing processes and technical solutions that channel individual and team creativity to create robust, designed solutions.
Data consumers in an organization will always find a way to solve for their needs. Providing a robust platform, along with documentation and available expertise for complex use cases, enables colleagues and stakeholders to focus their curiosity and creativity on solving for business value in a scalable way.
The platform-as-ecosystem metaphor is a helpful reminder that the platform has to evolve with its users and data products.
In short, designing a data platform for data consumers requires breaking away from the traditional, centralized approach and starting with the data consumers first. While moving to data self-service can be challenging, the reward of having rapid insights at your fingertips and empowering users to build their own reports and dashboards should make it worthwhile.
If you’re looking to start on the path of distributed data ownership at your organization, check out this Datanova session on building a reliable data platform with Trino and data observability.