Read about the knowledge graph and how many enterprises are already embracing the idea of benefiting from it.
The Semantic Web, both as a research field and a technology stack, is seeing mainstream industry interest, especially with the knowledge graph concept emerging as a pillar of well and efficiently managed data. But what exactly are we talking about when we talk about the Semantic Web? And what are the commercial implications of semantic technologies for enterprise data?
The Semantic Web started in the late 1990s as a fascinating vision for a web of data that is easy to interpret by both humans and machines. One of its pillars is ontologies: explicit formal conceptual models used to semantically describe both unstructured content and databases. While the Semantic Web is often criticized for being too academic, two of its incarnations already enjoy massive adoption.
The first one is Schema.org: millions of web pages are tagged with semantic annotations to enable a much better web search experience. The second one is Linked Open Data (LOD): a cloud of interlinked structured datasets published without centralized control across thousands of servers. Knowledge graphs (KGs) came later, but quickly became a powerful driver for the adoption of Semantic Web standards and of the many semantic technologies implementing them. KGs bring the Semantic Web paradigm to enterprises by introducing semantic metadata that drives data management and content management to new levels of efficiency, breaking down silos so that they can synergize with various forms of knowledge management. This way, KGs help organizations smarten up proprietary information by using global knowledge as context for interpretation and a source for enrichment.
In this post you will discover the aspects of the Semantic Web that are key to enterprise data, knowledge and content management. We will walk you through the Semantic Web's roots and the debate around it, then take you beyond its academic and visionary aspects into the world of efficient enterprise data management with semantic technologies and knowledge graphs. Connecting the dots between concepts like RDF, semantic annotation and Linked Open Data, we will help you understand why the Semantic Web will always work and how knowledge-intensive domains and applications can benefit from its affordances.
In 1994, Tim Berners-Lee described the Web as "a flat, boring world devoid of meaning" for computers (Plenary Talk, Geneva). To set the stage for a Web more meaningful to our machines, in 1998 a Semantic Web Road Map came into being. After the design of this "architectural plan untested by anything except thought experiments", in 2001 the Semantic Web seized the public's imagination with a seminal article featured in Scientific American: The Semantic Web: A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities.
Today, 20 years after Tim Berners-Lee, James Hendler and Ora Lassila outlined a Semantic Web-driven world where intelligent software agents automatically book flights and hotels and give us personalized answers, we can see Google transitioning from a search engine into a question-answering system and Alexa becoming a device that can book a flight for you. There are more than 80 million pages with semantic, machine-interpretable metadata conforming to the Schema.org standard. Take this restaurant, for example. Under the hood of its web content lie formalizations describing its address, opening hours, name and other details.
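To make the "under the hood" part concrete, here is a minimal sketch of the kind of Schema.org description a restaurant page embeds as a JSON-LD block. All of the restaurant details below are invented for illustration; the shape of the markup (the @context, @type and nested property structure) is what matters:

```python
import json

# A minimal, hypothetical Schema.org description of a restaurant,
# of the kind embedded in a web page inside a JSON-LD <script> tag.
restaurant = {
    "@context": "https://schema.org",
    "@type": "Restaurant",
    "name": "Example Bistro",               # invented name
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1 Example Street",  # invented address
        "addressLocality": "Exampleville",
    },
    "openingHours": "Mo-Sa 11:00-23:00",
    "servesCuisine": "Mediterranean",
}

# Serialized, this is what a crawler finds in the page source and
# can interpret without any natural-language understanding.
jsonld = json.dumps(restaurant, indent=2)
print(jsonld)
```

A search engine parsing this block knows, unambiguously, that the page describes a Restaurant with a specific address and opening hours, which is exactly what powers rich search results.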
Schema.org and Linked Open Data are just two incarnations of the Semantic Web vision. They are two massive information domains using W3C's stack of Semantic Web standards (RDF, SPARQL, OWL, etc.).
And still there is confusion around the Semantic Web.
What is it? How exactly is the Web semantic? What can it do and how are enterprise knowledge graphs related to it?
Let’s start with the first question.
Paradoxically, the Semantic Web, which aims to unambiguously describe things, people and concepts, is itself an ambiguous term.
For some, the Semantic Web is about intelligent agents browsing the Web and executing sophisticated tasks. For others, the concept boils down to smart and efficient data management (Pascal Hitzler traced the triumphs and challenges of two decades of Semantic Web research and applications in a recent Review of the Semantic Web Field). For its critics, it is a pipe dream, too academic to be realized, a vision that will never come true (see the perspectives on the degree to which the original Semantic Web vision has been realized, and the impact it can potentially have on the Web, gathered by Aidan Hogan). Marshall and Shipman, for example, explored the many rhetorical, theoretical and pragmatic perspectives on the Semantic Web, discerning three major threads in them. The Semantic Web, their research showed, is seen as:
(1) a universal library, to be readily accessed and used by humans in a variety of information use contexts; (2) the backdrop for the work of computational agents completing sophisticated activities on behalf of their human counterparts; and (3) a method for federating particular knowledge bases and databases to perform anticipated tasks for humans and their agents.
17 years later, in the third edition of The Semantic Web for the Working Ontologist, authors Dean Allemang, James Hendler and Fabien Gandon covered these same perspectives with one distilled explanation:
The Semantic Web faces the problem of distributed data head-on.
In more detail, they explained that just as the hypertext Web changed how we think about the availability of documents, the Semantic Web is a radical way of thinking about data. Its main idea is to support a distributed Web at the level of the data where organizations or individuals don’t just publish a human-readable presentation of information but a distributable, machine-readable description of the data.
We can see all of these theoretical investigations reflected in practice in what Tim Berners-Lee once said when asked about the Semantic Web: it is a way to pull in data, then pull in other data, and connect them to see how things fit together.
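That "pull data, then connect it" idea can be sketched in a few lines. The snippet below uses plain Python tuples as a stand-in for RDF triples and entirely invented identifiers; the point is that two independently published datasets merge by simple set union, and shared identifiers become the join points for new questions:

```python
# Triples a company publishes about itself (all identifiers invented).
company_data = {
    ("ex:AcmeCorp", "ex:headquarteredIn", "ex:Berlin"),
    ("ex:AcmeCorp", "ex:industry", "ex:Publishing"),
}

# Triples pulled from an open dataset, published by someone else.
open_data = {
    ("ex:Berlin", "ex:population", "3700000"),
    ("ex:Berlin", "ex:capitalOf", "ex:Germany"),
}

# "Pull data, then pull other data, and connect them": because both
# sources use triples, merging is just set union.
graph = company_data | open_data

def objects(graph, subject, predicate):
    """All objects of triples matching a subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}

# The shared identifier ex:Berlin now answers a cross-source question:
# what is the population of the city where AcmeCorp is headquartered?
city = objects(graph, "ex:AcmeCorp", "ex:headquarteredIn").pop()
print(objects(graph, city, "ex:population"))  # {'3700000'}
```

Neither source alone can answer the final question; the answer emerges only from the connection, which is the essence of Berners-Lee's remark.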
If you've used Google, you've used the cornucopia of Linked Data across the Web through Google's Knowledge Graph (reportedly supported by Freebase, the knowledge base Google acquired in 2010). If you've enjoyed the efficiency of rich snippets, you've enjoyed the riches Schema.org (based on RDF) has brought to the world of search since 2011. If you've used Wikidata, the structured encyclopedia, you've been using a giant RDF knowledge graph describing about 100 million topics with over 10 billion properties and relationships. Wikidata is also one of the sources from which Google's Knowledge Graph is updated.
Taking a closer look at these applications, we see two main perspectives from which the Web is becoming increasingly semantic.
Ontotext was founded in 2000 with the Semantic Web in its genes and we had the chance to be part of the community of its pioneers. We can’t imagine looking at the Semantic Web as an artifact. We rather see it as a new paradigm that is revolutionizing enterprise data integration and knowledge discovery. Below, we outline the two directions in which we at Ontotext see and build the Semantic Web.
The two distinct threads interlacing in the current Semantic Web fabric are the web pages semantically annotated with Schema.org (structured data on top of the existing Web) and the Web of Data existing as Linked Open Data.
It is these two important types of data that, taken together, implement the Semantic Web vision, bringing forward innovative ways of tackling data management and data integration challenges. (Read about Schema.org and LOD in our Knowledge Hub.) In them, we can see context as the enabler of value creation: context built by bringing data pieces together at web scale. And that connectivity at the data level is what makes the Semantic Web and the technologies related to it such a good solution to the challenges of knowledge management and data integration, and, ultimately, the foundation for the latest reincarnation of the field: the knowledge graph.
Facing the need to manage and analyze information at a previously unforeseen scale, organizations began searching for infrastructures that could handle the massive volume of available data and provide the means to make sense of it. In this "data + knowledge" era in the history of creating intelligent systems that integrate knowledge and data at large scale (as Juan Sequeda calls the period from the 2000s until now in his "A Brief History of Knowledge Graphs"), knowledge graphs began to emerge as such infrastructures.
Enabling semantic search and easier, deeper navigation across diverse data, knowledge graphs have become a business-critical element for many enterprises today. Among the largest knowledge graphs are those of Google, IBM, Amazon, Samsung, eBay, Bloomberg and The New York Times. Most of these first big knowledge graphs are used in web-to-consumer applications, where a single graph serves a wide variety of clients based on non-proprietary information.
Enterprise knowledge graphs came as a second wave and serve a different purpose: they use ontologies to make explicit the various conceptual models (schemas, taxonomies, vocabularies, etc.) used across different systems in the enterprise. In enterprise data management parlance, knowledge graphs represent a premium sort of semantic reference data: a collection of interlinked descriptions of entities – objects, events or concepts (see our definition of knowledge graphs).
Providing a formal unified conceptual model, ontologies enable unified access to and correct interpretation of diverse information and greatly facilitate analytics, decision making and knowledge re-use. The most advanced enterprise knowledge graphs smarten up proprietary information by using global knowledge as context for interpretation and a source for enrichment. Such knowledge graphs deliver not only "operational optimizations" but also help organizations combine their proprietary wisdom and information with rich domain knowledge and gain a competitive advantage in dynamic environments. And while not all knowledge graphs (see Adoption of Knowledge Graphs, late 2019) are built the semantic modeling way, they have all benefited from the Semantic Web, either because they use RDF or because, to varying extents, they've used Linked Data to broaden the scope of what the graph "knows".
The "know more" thread is central to understanding the rapid adoption of semantic technologies for knowledge graphs. For a system to "know" more, it needs knowledge broader than any single organization can attain: no one company on the planet can build the ultimate knowledge graph. As Amit Sheth explained, today's knowledge graphs needed for Google Semantic Search or Amazon Alexa seem to be built, as a rule of thumb, with at least 10,000 pairs of hands.
It is exactly this ability to derive knowledge by interconnecting data that turns building and maintaining an enterprise knowledge graph into building a competitive advantage. The more connected the data (Linked Data), the more knowledge the enterprise knowledge graph is infused with. And it is that knowledge (enabled by the technologies of the Semantic Web, specifically by RDF) that gives a competitive advantage to the company building the graph.
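A toy sketch of "deriving knowledge by interconnecting data": following a chain of triples yields facts that no single triple states directly. As before, tuples stand in for RDF triples and every identifier below is invented; a real knowledge graph would express this with a transitive OWL property and let the triplestore do the inference:

```python
# Location facts asserted directly, one hop at a time (all invented).
triples = {
    ("ex:Plant7", "ex:locatedIn", "ex:Lyon"),
    ("ex:Lyon", "ex:locatedIn", "ex:France"),
    ("ex:France", "ex:locatedIn", "ex:Europe"),
}

def located_in(graph, entity):
    """Every region reachable via ex:locatedIn, i.e. the transitive
    closure of the property starting from `entity`."""
    found = set()
    frontier = {entity}
    while frontier:
        frontier = {o for s, p, o in graph
                    if p == "ex:locatedIn" and s in frontier
                    and o not in found}
        found |= frontier
    return found

# No triple says Plant7 is in Europe, yet the connected data does.
print(located_in(triples, "ex:Plant7"))
```

The derived fact (the plant is in Europe) was never asserted by anyone; it exists only because the individual pieces of data are connected, which is precisely the advantage the paragraph above describes.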
The major added value of knowledge graphs is the paradigm of using ontologies, explicit formal conceptual models, to put together data scattered across different systems. The key characteristic is that ontologies can capture, integrate and operationalize knowledge across several disciplines and types of systems.
In any of its aspects, for us and our clients, the Semantic Web will always work, incessantly providing the necessary technologies for granular, detailed and well-described semantic metadata. RDF is expressive enough to bring all of these models together and make them work together. That is the genius of RDF: as a way of presenting data and metadata, it is a good fit for all these things at once. The RDF data model and the other standards in W3C's Semantic Web stack (e.g., OWL and SPARQL) enable the use of these knowledge models as hubs for data, metadata and content.
In a nutshell, summarizing the enabling features of these standards: global identifiers facilitate interoperability, formal semantics brings explicit common meaning, and validation (SHACL/RDF Forms) leads to high data quality.
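The validation point can be illustrated with a plain-Python stand-in for the kind of shape constraint SHACL expresses declaratively (this is not SHACL syntax, just the same idea in miniature, with invented identifiers): every node typed as a person must carry exactly one name.

```python
# Toy graph: tuples stand in for RDF triples (identifiers invented).
triples = {
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "rdf:type", "ex:Person"),  # bob has no name: a violation
}

def validate_names(graph):
    """Return every ex:Person node that does not have exactly one
    ex:name value, mimicking a SHACL min/max-count-1 constraint."""
    persons = {s for s, p, o in graph
               if p == "rdf:type" and o == "ex:Person"}
    return [s for s in persons
            if len([o for s2, p, o in graph
                    if s2 == s and p == "ex:name"]) != 1]

print(validate_names(triples))  # ['ex:bob']
```

In a real deployment the same constraint would live in a SHACL shapes graph and be enforced by the triplestore on ingestion, which is how validation translates into the "high data quality" claimed above.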
Despite the widespread belief that knowledge graphs mainly help enterprises find content more easily and deal with heterogeneous data more efficiently (which they do), the biggest driver for enterprises to build KGs is to gain better insights and a competitive advantage by smartening up their proprietary information, using global knowledge as context for interpretation and a source for enrichment.
In knowledge-intensive domains and applications, which require highly interconnected reference data and complex relationships between entities, knowledge graphs help enterprises gain profound insights via linking, analysis and exploration of diverse databases, content, and proprietary and global data.
And all that mesh of data – or data fabric, as you might see it referred to – wouldn't have been possible without the affordances of the Semantic Web for connecting data in knowledge graphs in order to derive value from it.
Are you ready to learn more about the reincarnation of the Semantic Web – the knowledge graph?
Listen to our webinar: Knowledge Graph Maps: 20+ Applications and 30+ Capabilities, which focuses on enterprise knowledge graphs as hubs for data, metadata and content, offering unified views of diverse information.