Data Mesh 101: How Data Mesh Helps Organizations Be Data-Driven and Achieve Velocity

This article was originally published on TDWI.org.

In the last part of our series, we examine how data mesh enhances performance and can help your organization and data teams work more effectively.

February 12, 2024 7 mins. read Sumit Pal

In the first part of our series, we introduced the basics of data mesh. In part 2, we explored best practices for adopting the technology. In this, the last part of our series, we examine several of the organizational benefits data mesh offers.

According to Zhamak Dehghani, who introduced the concept of the data mesh, if stakeholders feel any pain around finding and accessing the right data (leading to slower innovation), then data mesh is likely the right fit for the organization. Usually, organizations will combine different domain topologies, depending on the trade-offs, and focus on specific aspects of the data mesh approach. Once in place, an effective implementation fosters a mindset in which the organization prioritizes and values data for decision-making, formulating strategies, and day-to-day operations.

How Does a Data Mesh Help Organizations and Lines of Business?

As organizations become more data-driven, different use cases will always require different types of transformations, putting a heavy load on the centralized teams. For large enterprises, a data mesh distributes data ownership and reduces dependencies between services. This promotes data autonomy and enables decision-making about data domains without centralized gatekeepers. It also breaks down the code and data monolith and distributes it across the domain teams, which results in better management and scalability.

The data mesh concept will mitigate cognitive overload when building data-driven organizations that require intense technical, domain, and operational knowledge. This is especially beneficial when teams need to increase data product velocity with trust and data quality, reduce communication costs, and help data solutions align with business objectives. Transferring ownership of data and data sets to domain-specific units that possess a deeper understanding of rules around the data empowers teams, improves data quality and trust, and greatly accelerates the building of data models and analytics. With these decentralized teams, each business unit can autonomously make decisions to resolve issues and deliver products with velocity and agility.

It’s important to note that implementing a data mesh should be a business initiative rather than a technical one. Organizational culture is a key factor in determining the viability and success of a data mesh initiative. Empowering individual domains and integrating them as a cohesive whole requires a reevaluation of various aspects of the business, including how data and information are managed and shared.

Data engineers and data analysts have traditionally been at odds over ownership, support, and responsibilities. A decentralized, domain-focused ownership model creates data contracts covering data quality, discovery, product schema, and other essential aspects of data health; these contracts mitigate tension and reduce dependence on resources. Business teams can now transition from relying on IT for data management to owning and designing the full-stack underlying logic of information management. However, domain teams need to understand their implementation choices and processes and how they affect the building of data products.
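To make the idea of a data contract concrete, here is a minimal Python sketch of what a domain team might publish alongside a data product. It is an illustration only: the `DataContract` class, its fields, and the `orders.daily` product are hypothetical names chosen for this example, not part of any data mesh standard or tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract a domain team publishes with a data product."""

    product: str              # name consumers use to discover the product
    owner: str                # accountable domain team
    schema: dict[str, type]   # column name -> expected Python type
    required: frozenset[str] = field(default_factory=frozenset)  # non-null columns

    def violations(self, row: dict) -> list[str]:
        """Return a readable list of contract violations for one record."""
        problems = []
        for column, expected in self.schema.items():
            value = row.get(column)
            if value is None:
                if column in self.required:
                    problems.append(f"{column}: missing required value")
            elif not isinstance(value, expected):
                problems.append(
                    f"{column}: expected {expected.__name__}, got {type(value).__name__}"
                )
        return problems


# Assumed example product owned by a sales domain.
orders_contract = DataContract(
    product="orders.daily",
    owner="sales-domain",
    schema={"order_id": str, "amount": float, "coupon": str},
    required=frozenset({"order_id", "amount"}),
)

good = {"order_id": "A-1", "amount": 19.99, "coupon": None}
bad = {"order_id": "A-2", "amount": "19.99"}

print(orders_contract.violations(good))
print(orders_contract.violations(bad))
```

Because the producing domain states the schema and the required fields up front, analysts consuming the product can check records against the contract instead of negotiating each expectation with engineers.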

What Does the Data Mesh Do that Other Approaches Can’t?

For many organizations, a centralized data platform will fall short because it gives data teams much less autonomy over managing increasingly diverse and voluminous data sets. Across the organization, different use cases require different ingestion, data transformations, and data formats, which turns the centralized data team into a bottleneck as it tries to deliver everything everywhere to everyone. The problem with this approach is that it fails to acknowledge the diversity of users’ specialized data needs across the organization and its ecosystem of data consumers. In most enterprises, data is needed and produced by many business units but owned and trusted by no one. This results in data quality and data lineage issues that eventually affect downstream analytical value.

Domain-driven or domain-oriented paradigms such as a data mesh provide a balanced middle ground. A centralized data engineering team focuses on building a governed self-service infrastructure while domain teams use the services to build full-stack data products. By contrast, the evolution of data architectures from data warehouses to data lakes and data lakehouses has relied on centralized, highly skilled technical teams building data solutions.

However, the data mesh is not about introducing new technologies. Instead, it involves interconnecting existing technologies by restructuring teams across domains and, through self-service and reusability, eliminating the need for constant training on emerging technologies. This returns data ownership to the teams that best understand the business rules, data quality rules, domain context, and semantics of the data. Additionally, using a data mesh makes data the real organizational concern, turning it into a product and organizing teams according to domains. It accomplishes this by decentralizing the process, thereby relieving central teams of the pressure of constant data-handling requests.

Because data capabilities for domains are made available through the data mesh node, adopting a data mesh does not require doing away with data lakes and warehouses. Rather, these become part of the self-service platform that supports the storage and computing needs of each node. Data mesh, with its domain-centric approach, simplifies the flow and management of data in support of business objectives and outcomes. It embraces the reality of business and specialized data needs within teams and organizations. This leads to enhanced scalability, increased development speed, heightened resilience, improved data governance, greater flexibility, better data quality, improved data accessibility, and closer alignment with business requirements.
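The division of labor described in this section, with a central team running a governed self-service platform and domain teams publishing data products on top of existing lakes and warehouses, can be sketched in a few lines of Python. Everything here is a hypothetical illustration: the `SelfServePlatform` class, its method names, and the governance rules are assumptions made for the example rather than a description of any real platform.

```python
class SelfServePlatform:
    """Hypothetical catalog run by the central platform team."""

    # Existing storage systems stay in use; the mesh sits on top of them.
    ALLOWED_STORAGE = {"lake", "warehouse"}

    def __init__(self):
        self._catalog = {}

    def register(self, domain, name, owner, storage):
        """Register a data product; governance checks run at this single entry point."""
        if not owner:
            raise ValueError("every data product needs an accountable owner")
        if storage not in self.ALLOWED_STORAGE:
            raise ValueError(f"unsupported storage: {storage}")
        key = f"{domain}.{name}"
        self._catalog[key] = {"owner": owner, "storage": storage}
        return key

    def discover(self, domain):
        """List the data products a domain has published, sorted by name."""
        prefix = f"{domain}."
        return sorted(key for key in self._catalog if key.startswith(prefix))


# Domain teams publish their own products through the shared platform.
platform = SelfServePlatform()
platform.register("b2c", "user_behavior", owner="b2c-team", storage="lake")
platform.register("b2c", "profiles", owner="b2c-team", storage="warehouse")
platform.register("b2b", "accounts", owner="b2b-team", storage="warehouse")

print(platform.discover("b2c"))
```

The point of the sketch is that governance lives in one shared place (the `register` checks) while ownership stays with the domains, which is the balance the data mesh approach aims for.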

Data Mesh Case Studies

Real-world examples of successful data mesh adoption include:

  • DPG Media, a European media company, moved away from regional and function-specific teams towards business-oriented domains such as B2B, B2C, user behavior, and profile management. With the guiding principles of data mesh, they were able to answer the strategic question “How can we become truly data-driven?” by building data products with domain owners.
  • Zalando, a leading European fashion platform, leveraged a data lake and ran into significant challenges around lack of data ownership, data quality, and organizational scalability with an increasing number of data sources and consumers. To address that, they adopted a data mesh with decentralized data ownership, prioritizing data domains with the guiding principle of data as a product and not a byproduct. These changes helped Zalando overcome the bottleneck at the data team level.
  • For Netflix, the main objective for adopting a data mesh was to enable different studios that work with Netflix to use a single system, capable of dealing with large volumes of data in an integrated and standardized way. There were five major problems they wanted to solve: removing duplicated effort for data pipelines, reducing unnecessary overhead in pipeline maintenance, enabling good implementation practices throughout various data processing stages, improving reusability, and reducing the learning curve. As at Zalando, a centralized team provided self-service infrastructure for the domains to develop data pipelines, abstracting away the complexities of configuration, deployment, and troubleshooting. They leveraged technologies such as GraphQL and table formats such as Apache Iceberg, together with a metadata catalog, to build the pipelines and enforce compliance and development standards. Netflix implemented this without domain users needing to know the underlying technologies and complexity. Centralized teams also adopted an auditing mechanism to verify data accuracy and adherence to SLAs and to ensure data quality.

  • Intuit, a U.S. company that specializes in financial software, also had executive-level buy-in around the data mesh concept and allocated the resources necessary to make an organizational, cultural, and mindset shift. Intuit was inspired to adopt the data mesh to facilitate data discoverability, data literacy, data publication, consumption, and trust across the organization. At the same time, it gave domain owners the capability to build domain-specific applications, services, and data products as well as the responsibility to maintain and provision their data.
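The auditing mechanism mentioned in the Netflix example (checking data accuracy and adherence to SLAs) is not described in detail, so the following Python sketch only illustrates the general idea. The function names, the freshness and completeness rules, and the thresholds are all assumptions made for this example and do not reflect Netflix's actual implementation.

```python
from datetime import datetime, timedelta, timezone


def audit_freshness(last_updated, max_age, now=None):
    """Return True if the data product was refreshed within its SLA window."""
    now = now or datetime.now(timezone.utc)
    return now - last_updated <= max_age


def audit_completeness(rows, column, min_ratio):
    """Return True if at least min_ratio of the rows have a value in the column."""
    if not rows:
        return False
    filled = sum(1 for row in rows if row.get(column) is not None)
    return filled / len(rows) >= min_ratio


# Assumed SLA for a hypothetical product: refreshed at least every 24 hours,
# with at least 75% of records carrying a title_id.
checked_at = datetime(2024, 2, 12, 12, 0, tzinfo=timezone.utc)
refreshed_at = datetime(2024, 2, 12, 6, 0, tzinfo=timezone.utc)
sample = [
    {"title_id": "t1"},
    {"title_id": "t2"},
    {"title_id": None},
    {"title_id": "t4"},
]

fresh = audit_freshness(refreshed_at, timedelta(hours=24), now=checked_at)
complete = audit_completeness(sample, "title_id", min_ratio=0.75)
print(fresh, complete)
```

A central team could run checks like these on a schedule against every registered data product and report failures back to the owning domain.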

Delivering Value

In an era when data analytics means competitive differentiation, it’s critical for decision-makers to have access to the data they need, when they need it. Data mesh takes data environments one step closer to the idea of data democratization; however, it is not an end in and of itself and will not magically deliver overnight results. Your organization must be ready to leverage the benefits of a data mesh before implementation gets underway. Moreover, transitioning from centralized data management to a decentralized environment is challenging.

For your organization, it is not a question of “yes or no” to data mesh but more about identifying what is preventing you from delivering business value in a timely and effective manner.

 


Strategic Technology Director at Ontotext

Sumit Pal is an ex-Gartner VP Analyst in the Data Management & Analytics space. Sumit has more than 30 years of experience in the data and software industry, in roles spanning companies from startups to enterprise organizations, building, managing, and guiding teams and building scalable software systems across the stack, from the middle tier and data layer to analytics and UI, using Big Data, NoSQL, DB Internals, Data Warehousing, Data Modeling, and Data Science.
