What is Data Fabric?

Data fabric is the latest addition to the growing alphabet soup of IT vocabulary. This post discusses the what, why, when, and how of data fabric.

Data fabric is a combination of architecture and technology that makes it easier for organizations to manage heterogeneous data. Its logical data architecture is designed to cope with growing data volumes spread across silos. It provides seamless data connectivity, delivering value and insights through a semantic knowledge layer. By leveraging metadata, machine learning, and automation, it provides a holistic view of enterprise data across data formats and locations. Data fabric also enables data federation and virtualization, offering unified access in a distributed data environment and capturing and contextually connecting data across business domains.

Think of data fabric as a queryable layer that allows access to data and information across data silos. This layer is implemented over existing repositories, data centers, cloud providers, and edge. As a result, end users can query the data, irrespective of where it’s located or what format it’s in. Data fabric also facilitates thinking of data as an enterprise asset rather than in terms of what the current schema is or what query language to use. This ensures data assets in an organization can be accessed, combined, and governed more efficiently and effectively, increasing the added value of these assets.
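As a rough illustration of the "queryable layer" idea, here is a minimal Python sketch. It assumes two toy silos (an in-memory SQLite table and a CSV export) and a hypothetical `FabricLayer` class. None of these names come from a real data fabric product, and a production fabric would push queries down to the sources rather than scan them.

```python
import csv
import io
import sqlite3
from typing import Callable, Dict, Iterator, List

Row = Dict[str, str]
Reader = Callable[[], Iterator[Row]]


def sqlite_source(conn: sqlite3.Connection, table: str) -> Reader:
    """Hypothetical adapter: expose a SQLite table as plain dict rows."""
    def read() -> Iterator[Row]:
        cursor = conn.execute(f"SELECT * FROM {table}")
        columns = [d[0] for d in cursor.description]
        for values in cursor:
            yield {c: str(v) for c, v in zip(columns, values)}
    return read


def csv_source(text: str) -> Reader:
    """Hypothetical adapter: expose CSV text as plain dict rows."""
    def read() -> Iterator[Row]:
        yield from csv.DictReader(io.StringIO(text))
    return read


class FabricLayer:
    """Toy access layer: one query interface over several registered silos."""

    def __init__(self) -> None:
        self._sources: Dict[str, Reader] = {}

    def register(self, name: str, reader: Reader) -> None:
        self._sources[name] = reader

    def query(self, predicate: Callable[[Row], bool]) -> List[Row]:
        # The data stays in place; rows are read on demand and tagged
        # with the silo they came from.
        results: List[Row] = []
        for name, reader in self._sources.items():
            for row in reader():
                if predicate(row):
                    results.append({**row, "_source": name})
        return results


# Two silos with different storage formats (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id TEXT, country TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [("c1", "DE"), ("c2", "US")])

fabric = FabricLayer()
fabric.register("crm_db", sqlite_source(conn, "customers"))
fabric.register("partner_csv", csv_source("customer_id,country\nc3,DE\nc4,FR\n"))

german = fabric.query(lambda row: row["country"] == "DE")
```

The caller asks one question ("customers in DE") and never needs to know that one answer came from a database and the other from a file.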

Why Data Fabric?

As organizations’ data stacks grow in complexity, the challenges facing data teams multiply. Hunting for treasure (actionable insights) across disparate data sources is mind-bogglingly complex without a data map to guide data professionals. The overarching goal of data fabric is to provide data teams with a uniform access layer across these data sets.

Compared to a centralized, consolidated approach to data architecture, a data fabric-oriented approach considerably reduces the time and effort data teams need to build end-to-end platforms. Data stays in situ and is contextually connected across business units according to its business meaning, while the fabric provides uniform access to it.

Data fabric helps organizations find and reuse data across environments (from on-premises to cloud), which also improves the usability and quality of key data assets. The metadata layer, the core foundation of a data fabric, enhances data intelligence in the ecosystem by recognizing different types of data, which data is relevant, and which data needs privacy controls and governance.
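To make the role of the metadata layer more concrete, the sketch below classifies columns by sampling their values and flags those that likely need privacy controls. The regular expressions, the 0.8 threshold, and the names `classify_column` and `build_metadata` are illustrative assumptions. Real implementations typically combine many more signals, including ML models.

```python
import re
from typing import Dict, List

# Assumed, deliberately simple patterns; not production-grade detectors.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?\d[\d\s-]{6,}\d$"),
}
SENSITIVE = {"email", "phone"}


def classify_column(values: List[str], threshold: float = 0.8) -> str:
    """Label a column if enough of its non-empty values match a pattern."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return "unknown"
    for label, pattern in PATTERNS.items():
        hits = sum(1 for v in non_empty if pattern.match(v))
        if hits / len(non_empty) >= threshold:
            return label
    return "unclassified"


def build_metadata(table: Dict[str, List[str]]) -> Dict[str, Dict[str, object]]:
    """Produce one metadata record per column, including a privacy flag."""
    metadata: Dict[str, Dict[str, object]] = {}
    for column, values in table.items():
        label = classify_column(values)
        metadata[column] = {
            "classification": label,
            "needs_privacy_controls": label in SENSITIVE,
        }
    return metadata


# Illustrative sample data.
table = {
    "contact": ["ana@example.com", "bob@example.org", ""],
    "mobile": ["+49 30 1234567", "555-010-9999"],
    "notes": ["called twice", "prefers email"],
}
metadata = build_metadata(table)
```

Records like these are what downstream governance and discovery services would consume, instead of inspecting the raw data themselves.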

What Are the Business Benefits of Using Data Fabric?

By creating a unified, consistent data access layer across organizational data, data fabric empowers both operational and analytic use cases. It enables a virtualized, integrated, metadata-driven approach to data management. It also creates a connected enterprise with a knowledge layer to power AI and analytic-driven applications.

The data fabric-oriented approach brings a wide-ranging set of business benefits:

  • Provides a single point of access to discover, maintain, and consume data with context. 
  • Automates governance, policy enforcement and compliance, and improves data quality with the metadata on which it is built.
  • Facilitates discovery of private, critical data elements in an automated way with the supporting metadata.
  • Enables self-service with collaboration and makes data consumable in a unified way.
  • Automates data discovery, classification, and data curation processes with ML, AI, and NLP techniques, leading to faster time-to-value.

How Does Data Fabric Work?

Data fabric is not something that can be bought as one complete tool. It is powered by a layer of software over existing systems, composed of several services through which data consumers access data.

Some of the functional components of a data fabric are shown in the following diagram:

A high-level architecture of data fabric with its components.

The resulting data fabric leverages rules to automatically map and link policies to data assets by using classifications, business vocabularies, and taxonomies. Implementations vary across organizations, with some choosing to build a fabric-like architecture from a variety of technologies and products. Watch out for vendors who embellish data catalogs and sell them under the data fabric moniker.
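The rule-based linking of policies to data assets can be sketched as follows. The taxonomy, the policy names, and the asset identifiers are all hypothetical. The point is only that a policy attached to a broad term (such as `personal_data`) reaches every asset classified with a narrower term beneath it.

```python
from typing import Dict, List, Set

# Assumed toy taxonomy: each term points to its broader parent term.
TAXONOMY: Dict[str, str] = {
    "email": "contact_info",
    "phone": "contact_info",
    "contact_info": "personal_data",
    "salary": "financial_data",
}

# Assumed rules: policies attached to taxonomy terms.
POLICY_RULES: Dict[str, List[str]] = {
    "personal_data": ["mask_for_analysts"],
    "contact_info": ["consent_required"],
    "financial_data": ["restrict_to_finance"],
}


def ancestors(term: str) -> List[str]:
    """Return the term followed by all of its broader terms."""
    chain = [term]
    while chain[-1] in TAXONOMY:
        chain.append(TAXONOMY[chain[-1]])
    return chain


def policies_for(classifications: List[str]) -> Set[str]:
    """Collect every policy that applies via the taxonomy."""
    policies: Set[str] = set()
    for term in classifications:
        for broader in ancestors(term):
            policies.update(POLICY_RULES.get(broader, []))
    return policies


# Data assets tagged with classifications (illustrative).
assets = {
    "crm.customers.email": ["email"],
    "hr.payroll.amount": ["salary"],
    "web.clicks.page": [],
}
links = {asset: sorted(policies_for(tags)) for asset, tags in assets.items()}
```

Because policies hang off vocabulary terms rather than individual tables, classifying a new asset is enough to bring it under the right policies.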

All in all, data fabric solutions rely on a universal data representation for efficient and contextual search. Done properly, they use knowledge graphs to incorporate semantics and context, and ML algorithms to automate discovery and cataloging. These solutions support different data integration patterns by leveraging metadata to simplify and augment integration.
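A knowledge graph can be thought of as a set of subject-predicate-object statements. The sketch below uses plain Python tuples rather than an actual graph database, and the predicates (`describes`, `relatesTo`, `storedIn`) and dataset names are invented for illustration. It shows how a search for a business concept can surface datasets in different silos through their shared meaning.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Assumed toy graph linking datasets, business concepts, and locations.
TRIPLES: List[Triple] = [
    ("crm.customers", "describes", "Customer"),
    ("erp.invoices", "describes", "Invoice"),
    ("Invoice", "relatesTo", "Customer"),
    ("crm.customers", "storedIn", "on_prem_postgres"),
    ("erp.invoices", "storedIn", "cloud_object_store"),
]


def match(s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def datasets_about(concept: str) -> List[str]:
    """Datasets describing the concept or a directly related concept."""
    concepts = {concept} | {s for s, _, _ in match(p="relatesTo", o=concept)}
    return sorted(s for c in concepts for s, _, _ in match(p="describes", o=c))


found = datasets_about("Customer")
locations = {d: match(s=d, p="storedIn")[0][2] for d in found}
```

A search for "Customer" returns both the CRM table and the invoice data, together with where each one lives, even though neither dataset name contains the search term in the second case.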

When to Use Data Fabric?

Although the data fabric approach helps organizations generate actionable insights from ever-increasing volumes of data scattered across silos, it’s not a one-size-fits-all solution.

Data fabric solutions are best suited for large organizations with the following requirements:

  • Organizations with a rapidly growing data footprint, across myriad data sources and data formats, stored across multiple geophysical locations, that need to democratize access to this data.
  • Organizations that have highly interrelated data and struggle to unify data from different business units and departments.
  • Organizations where the lack of business and domain context and unified semantics hinders the appropriate usage of their data.

Conclusion

Siloed data islands lead to siloed thinking, and today’s data is generated, stored, and used across data centers, edge, and cloud providers.

Incorporating and building data fabric within an enterprise is a journey. It does not happen overnight, nor is it available as an off-the-shelf, pre-packaged tool. It also does not replace data warehouses, data lakes, or lakehouses. Instead, it makes them more accessible by bringing together data from heterogeneous data sources. It accomplishes that by providing a virtualization layer that assimilates data with zero copy while ensuring privacy and regulatory compliance.

Although data fabric has not become mainstream, organizations are increasingly adopting pieces of this approach when building data solutions. It’s an evolution of enterprise data architecture that addresses the two most challenging aspects of data management: getting a handle on data across data silos and semantically integrating that data.

Want to learn more about data fabric?

Check out our presentation: How Knowledge Graphs power Data Mesh and Data Fabric
