Learn about the linked data pathways to wisdom through ‘who’, ‘what’, ‘when’, ‘where’, ‘why’, ‘how to’ and, finally, ‘what is best’.
Whether you call it the lifeblood of our digital systems, the new oil or the shadow of our everyday activities, data is one of the most significant resources of the present and the future. Accessing, managing and integrating this resource across platforms is key to building robust solutions. Successfully integrated and managed, commercial data is also one of the enablers of industrial digitalization, economic growth, innovation and public services.
To serve as such an enabler, data needs to be trustworthy, managed and exchanged smoothly between various systems and, last but not least, easily accessed. It is towards this end that data spaces across industries and entire sectors are being built. For example, the European Union is investing heavily in data spaces, in related legislative initiatives (e.g., the EU data strategy) and in commercial incentives to facilitate the sharing of data.
Almost a decade ago, the notion of data spaces as a concept for decentralized data integration appeared in the article Data Spaces not Databases. Introduced as a “new agenda for data management”, the concept differs markedly from centralized data integration approaches. Data spaces allow data to stay stored at the source while remaining available to be accessed, managed and exchanged at a semantic level.
Years after the term was introduced, and a dozen associations and initiatives for applying the concept in industry later, data spaces are now seen as a mechanism for supporting data sharing and data sovereignty in ecosystems.
In more detail, data spaces are the total set of interoperable data-sharing mechanisms by organizations, individuals and groups in a certain domain or a sector.
In terms of societal impact, such interoperable data allows an unobstructed data flow and provides a solid basis for accessing information across industries and companies, while at the same time preserving sovereignty. As Lars Nagel, CEO of the International Data Spaces Association, framed it in his write-up The Magic of Data Spaces Now, data spaces provide greater interoperability than traditional data sharing and offer shared services to their participants.
Different countries approach data spaces differently. For example, in the US, data spaces are left to the private sector. By contrast, in China there is a combination of government surveillance with a strong control of Big Tech companies over massive amounts of data without sufficient safeguards for individuals.
In the context of the European Union, data spaces are a huge initiative for industrial data sharing, with the potential to enable efficient commercial data exchange and smoother industry digitalization. This can happen through legislation for data access, standardization for data exchange and initiatives for building better tools for data management, integration and harmonization.
Common European data spaces are part of the European strategy for data and aim to ensure that more data becomes available for use in the economy and society. Currently, a total of nine common data spaces are planned, namely: industrial (manufacturing) data space, Green Deal data space, mobility data space, health data space, financial data space, energy data space, agriculture data space, public administration data space, and skills data space. Additional industry initiatives build data spaces in automotive (Catena-X), transport and logistics, etc.
While all these spaces serve different goals and are to be built with different approaches, what unites them is the common need for secure data exchange, data sovereignty and data interoperability. Providing more data for building data spaces demands efficient data access, management and exchange across systems, together with standards that ensure data interoperability.
For that matter, there are a number of associations that bring forward the idea and practice of standard data exchange across industries and companies. Among them are:
And although more and more initiatives work towards better standardization of data across sectors and domains, few if any data spaces use semantics to represent data itself. Yet, this is where true data sovereignty, data interoperability and efficient, real-time data sharing and exchange practices lie.
Data in data spaces can be much more usable and valuable if provided in a machine-queryable manner at the level of each data piece, rather than at the level of the data space or dataset itself.
Both data spaces and semantic repositories aim at combining data from different sources to enable the integration of heterogeneous data stored across different systems. One of the common problems they try to solve is how to enable more efficient data sharing between systems. Where data spaces and systems built with semantic technologies differ is the level of granularity at which semantics is applied.
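The difference in granularity can be illustrated with a minimal sketch. It assumes data is held as plain subject-predicate-object tuples; the `ex:` identifiers, the sample values and the `match` and `steel_parts` helpers are hypothetical and serve only as illustration, not as how any particular data space is implemented.

```python
# Illustrative identifiers only: the "ex:" names and all values are made up.

# Dataset-level semantics: the catalogue describes the dataset as a whole.
catalog = [
    ("ex:dataset/parts", "dct:title", "Supplier parts list"),
    ("ex:dataset/parts", "dct:license", "ex:license/commercial"),
    ("ex:dataset/parts", "dcat:downloadURL", "https://example.org/parts.csv"),
]

# Data-level semantics: every fact inside the dataset is itself a statement.
facts = [
    ("ex:part/42", "rdf:type", "ex:Part"),
    ("ex:part/42", "ex:madeOf", "ex:material/steel"),
    ("ex:part/43", "rdf:type", "ex:Part"),
    ("ex:part/43", "ex:madeOf", "ex:material/aluminium"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def steel_parts(triples):
    """Find subjects stated to be made of steel (hypothetical predicate)."""
    return sorted(t[0] for t in match(triples, p="ex:madeOf", o="ex:material/steel"))
```

With only the catalogue, a consumer learns where the file is and under which license, but `steel_parts(catalog)` finds nothing: the question can be answered only after downloading and parsing the file. With statement-level data, `steel_parts(facts)` answers it directly.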
In a way, data spaces in the EU are already semantic. All of them heavily use semantic technologies in describing essential components of data sharing: datasets and related metadata, licenses, participants (data providers and data consumers), users, access rights, use and commercial agreements, etc. But although this is a big step towards a more connected data economy, there’s still a lot of work ahead. Data spaces need to be conceived and built by understanding and applying Linked Data principles.
To serve their goal of being a space for interoperable, trusted and easy-to-access public and commercial data, data spaces need to move from being a centralized place where organizations transfer their data towards spaces that allow data sharing through data distribution and federation.
In their current state, with a limited use of Linked Data, data spaces face several obstacles before reaching their goal of efficient, secure and interoperable data exchange, namely:
With Linked Data factored into the design of data spaces, the scenario changes significantly. Designing a data space with semantic interoperability in mind leads to more connected data and better utilized information across sectors. Linked Data principles can offer significant benefits in terms of efficiency of data provisioning and use, timeliness and locality of information.
By using Linked Data principles, European data spaces can reap significant benefits. In addition to more efficient data sharing and use, this can include improvements to machine learning and data science processes since Linked Data principles offer on-demand up-to-date access to data at its origin.
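The idea of on-demand, up-to-date access to data at its origin can be sketched as follows. This is a toy model under stated assumptions: the `Source` class, the `federated_query` function and all `ex:` identifiers are hypothetical, and each participant is reduced to an in-memory list of triples. In practice a participant would more likely expose a query endpoint, for example a SPARQL endpoint.

```python
class Source:
    """A hypothetical participant that keeps its own data and answers queries."""

    def __init__(self, name, triples):
        self.name = name
        self.triples = list(triples)

    def query(self, s=None, p=None, o=None):
        """Return matching triples; None acts as a wildcard."""
        return [
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        ]


def federated_query(sources, s=None, p=None, o=None):
    """Ask every source at query time; nothing is copied to a central store."""
    return [(src.name, t) for src in sources for t in src.query(s, p, o)]


# Illustrative data: two participants describe the same part.
supplier = Source("supplier", [("ex:part/42", "ex:inStock", "120")])
carrier = Source("carrier", [("ex:part/42", "ex:inTransit", "30")])
participants = [supplier, carrier]

before = federated_query(participants, s="ex:part/42")

# The supplier updates its own data; no export or re-upload step is involved.
supplier.triples[0] = ("ex:part/42", "ex:inStock", "95")

after = federated_query(participants, s="ex:part/42")
```

Because the query is evaluated against each source when it is asked, `after` already reflects the supplier's new stock figure, whereas a consumer working from a previously transferred copy would still see the old one.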
When it comes to the technological enablers of semantic data spaces, there are many ways to add more semantics to their design and construction. Two of them, Polyglot modeling approaches and Hybrid storage technologies, are closely related to bridging various formalizations, models and storage systems. We will talk in more detail about both of them in an upcoming post about how exactly data spaces can become semantic.
To better exploit its potential as the key resource of the future, data needs to be harmonized at the level of the data model, not just the data space. Moving forward from legacy to semantics, we need to systematically semanticize and reuse data standards. Linked Data principles provide a straightforward approach to better data distribution and technical sovereignty.
During his presentation at ENDORSE 2021, Fabien Gandon, Research Director in Informatics and Computer Science at Inria, noted that web open standards for Linked Data and knowledge graphs are key enablers of EU digital sovereignty. The same goes for data spaces: built with Linked Data at the level of the data model, data spaces would be truly sovereign and serve:
It is only natural, and now easier than ever before, to build the three pillars of data spaces (secure data exchange, data sovereignty and data interoperability) out of Linked Data blocks. Built that way, they can become the future-proof pillars for advancing data spaces as enablers of convergence and harmonization of data across industries, countries and enterprises.
Do you want to learn more about the linked data principles?