
Knowledge Graphs: Breaking the Ice

This post discusses the nature and key characteristics of knowledge graphs. It also outlines the benefits of formal semantics and how modeling graphs in RDF can help us easily identify, disambiguate and interconnect information.

December 8, 2023 · 9 min read · Gergana Petkova

A graph is like a map that represents real-life objects and the relationships between them. Many of us use Google, Twitter, Alexa and Siri every day, yet most of us don’t know (or think about) that these services are powered by knowledge graph technology. In a social network like Twitter, for example, the objects in the graph are people and organizations, and the relationships are ‘follows’ or ‘friends’.

The objects in knowledge graphs are called “entities” and can represent real things in the world, events, situations or even ideas. The descriptions of these entities have a specific structure and meaning (semantics). This allows both humans and machines to process them efficiently and unambiguously. These descriptions also reference other entities and their descriptions and in this way create a vast network of knowledge.

The three roles of a knowledge graph

A knowledge graph is a versatile way of organizing and using data. It can act as a database, a network and a knowledge base depending on how it’s designed and used. 

Like a database, knowledge graphs have schemas and users can apply complex structured queries to extract the specific data they need. However, unlike in relational databases, the schema of a graph is flexible and doesn’t need to be pre-defined.
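To preview the RDF notation introduced later in this post, here is a minimal sketch of that flexibility, using a made-up example.org namespace and illustrative entities and properties: two entities are described with entirely different sets of properties, and a new property is added without any schema change.

```turtle
@prefix : <http://example.org/> .

# Two entities of the same class, described with different sets of properties;
# no pre-defined table layout or schema migration is required.
:Alice a :Person ;
       :worksFor :AcmeCorp ;
       :hasAge 34 .

:Bob   a :Person ;
       :knows :Alice .

# A property that was never "declared" anywhere can simply be added at any point.
:Bob   :favoriteColor "green" .
```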

The data in a knowledge graph can be represented as a collection of nodes and edges and analyzed as a network structure. This enables users to apply various graph algorithms, traversal operations, optimizations and transformations.

Because of the formal semantics attached to the data, knowledge graphs can act as a knowledge base. This enables humans and machines to easily interpret this data and derive new information.

Formal semantics

Formal semantics, usually defined by an ontology, establishes an agreement between the developers of a knowledge graph and its users about the meaning of the data in the context of the domain. It relies on a number of representation and modeling instruments to express and interpret the data of a knowledge graph.

A description of an entity usually includes its classification with respect to a class hierarchy. Each entity is assigned to a class, and a class can in turn have a superclass representing a higher-level concept or subclasses representing more granular ones. For example, in domains like general news the most common classes are Person, Organization and Location. Continuing along the hierarchy, both Person and Organization can have the superclass Agent, whereas Location usually has subclasses like Country, City, etc.
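As a minimal sketch, such a hierarchy can be expressed in RDF Schema. The snippet below uses a made-up example.org namespace and only the classes mentioned above:

```turtle
@prefix :     <http://example.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Person and Organization share the higher-level concept Agent.
:Person       rdfs:subClassOf :Agent .
:Organization rdfs:subClassOf :Agent .

# Location has more granular subclasses.
:Country      rdfs:subClassOf :Location .
:City         rdfs:subClassOf :Location .
```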

The relationships between entities, on the other hand, are usually expressed by relation types. These indicate the nature of the relationship, such as friend, relative, competitor, etc. Relation types can also have formal definitions. For instance, parent-of can be defined as the inverse relation of child-of, and both can be considered specific cases of the symmetric relation relative-of.
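In OWL, for instance, these definitions could be sketched roughly as follows (again with a made-up example.org namespace):

```turtle
@prefix :     <http://example.org/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# parent-of is defined as the inverse relation of child-of.
:parentOf owl:inverseOf :childOf .

# Both are specific cases of the symmetric relation relative-of.
:relativeOf a owl:SymmetricProperty .
:parentOf rdfs:subPropertyOf :relativeOf .
:childOf  rdfs:subPropertyOf :relativeOf .
```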

Entities can also be associated with categories that describe specific aspects of their semantics. For example, a book can simultaneously belong to “Books about Africa”, “Bestseller”, “Books by Italian authors”, “Books for kids”, etc. 
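One possible way to model this, sketched here with the Dublin Core dcterms:subject property and made-up category resources, is:

```turtle
@prefix :        <http://example.org/> .
@prefix dcterms: <http://purl.org/dc/terms/> .

# The same book is linked to several independent categories at once.
:MyAfricanAdventure a :Book ;
    dcterms:subject :BooksAboutAfrica ,
                    :Bestseller ,
                    :BooksByItalianAuthors ,
                    :BooksForKids .
```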

It’s also possible to include “human-friendly” free text descriptions in a knowledge graph. This helps further clarify the design intentions for an entity and offers additional context and details for enhanced search capabilities.

Knowledge graphs in RDF

One of the common graph data models is the Resource Description Framework (RDF). Developed and standardized by the World Wide Web Consortium (W3C), it provides a powerful and expressive framework for representing data and metadata.

RDF Basics

RDF represents data as three-part statements called triples. An RDF triple consists of a Subject, a Predicate and an Object. The subject, the predicate and (usually) the object are each identified by a Uniform Resource Identifier (URI), which looks like a web page address.

Let’s consider the following example triples:

subject    predicate     object
:Wilma     :hasSpouse    :Fred
:Wilma     :hasAge       24

In the first triple, “Wilma hasSpouse Fred”, Wilma is the subject, hasSpouse is the predicate and Fred is the object. In the second triple, “Wilma hasAge 24”, Wilma is the subject, hasAge is the predicate and 24 is the object.
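In a concrete RDF serialization such as Turtle, these two triples could be written as follows. The example.org namespace is only a placeholder, and :Wilma is shorthand for the full URI http://example.org/Wilma:

```turtle
@prefix : <http://example.org/> .

# "Wilma hasSpouse Fred": the subject and the object are both resources.
:Wilma :hasSpouse :Fred .

# "Wilma hasAge 24": here the object is a literal value rather than a resource.
:Wilma :hasAge 24 .
```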

By connecting multiple triples together, we create an RDF graph. Consider, for example, a graph describing the characters and relationships in the Flintstones TV cartoon series. We see triples such as “PebblesFlintstone livesIn Bedrock” or “BamBamRubble livesIn Bedrock”. Together with the triples about locations, these tell us that the Flintstones and the Rubbles live in Bedrock and that Bedrock is part of Cobblestone County in Prehistoric America.

The other triples in the graph describe the relationships between the different characters (hasSpouse or hasChild) as well as their work association (worksFor). For example, we can see that Fred and Wilma are married, that they have a child Pebbles and that Fred works for the Rock Quarry company.
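A small fragment of this graph, sketched in Turtle with illustrative identifiers, could look like this:

```turtle
@prefix : <http://example.org/> .

# Family and work relationships
:FredFlintstone :hasSpouse :WilmaFlintstone ;
                :hasChild  :PebblesFlintstone ;
                :worksFor  :RockQuarryCompany .

# Both families live in Bedrock, which is part of Cobblestone County in Prehistoric America.
:PebblesFlintstone :livesIn :Bedrock .
:BamBamRubble      :livesIn :Bedrock .
:Bedrock           :partOf  :CobblestoneCounty .
:CobblestoneCounty :partOf  :PrehistoricAmerica .
```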

Labeled Property Graphs

Labeled property graphs (LPGs) are another graph data model, one that offers lightweight management of graph data. The primary motivation behind LPGs is not semantics, data exchange or publication, but efficient storage that enables quick querying and traversal of interconnected data.

LPG technology doesn’t have standardized schema, modeling or query languages, nor does it provide formal semantics and interoperability specifications. There are no established serialization formats for representing LPGs, no federation protocols for integrating data from multiple sources and no other mechanisms to ensure seamless interaction and compatibility between different LPG implementations.

So this model is most useful when data needs to be collected on-the-fly and analytics is done within the scope of a single project.

The role of RDF-star

While RDF allows statements to be made only about nodes in the graph, LPGs can attach descriptions or properties to both nodes and edges. This is a major difference between the two models. 

The RDF-star extension closes this gap by allowing RDF to make statements about other statements. This makes it possible to attach metadata that describes graph edges, such as scores, weights, temporal aspects and provenance.
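For example, using the RDF-star quoted-triple syntax, the employment edge from the Flintstones graph could be annotated with provenance and a confidence score (the annotation properties here are illustrative):

```turtle
@prefix : <http://example.org/> .

# The quoted triple << ... >> acts as the subject of further statements,
# so metadata is attached to the edge itself rather than to a node.
<< :FredFlintstone :worksFor :RockQuarryCompany >>
    :source     :EmployeeRegister ;
    :confidence 0.9 .
```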

The benefits of using semantic knowledge graphs

Overall, knowledge graphs represented in RDF allow data to be easily integrated, interconnected, identified, disambiguated and reused. This is possible because of a combination of factors discussed below.

The expressivity of Semantic Web standards enables the fluent representation of diverse data and content types. This includes data schema, taxonomies, vocabularies, metadata of various kinds as well as reference and master data. 

Another important aspect of RDF knowledge graphs is their formal semantics. Thanks to the precisely defined meanings, both humans and machines can interpret the model and data unambiguously.

Performance is also a critical aspect of semantic knowledge graphs. Because the RDF specifications have been carefully designed and proven in practical scenarios, users can efficiently manage knowledge graphs containing billions of facts and properties.

In addition, there are various specifications available in the RDF ecosystem to facilitate the interoperability of data across different systems and applications. They cover different aspects of data serialization, access, management and federation. 

Finally, standardization plays an essential role in everything discussed so far. All of these specifications have been standardized through the W3C community process, ensuring that the requirements of various stakeholders are met.

To be or not to be a knowledge graph

So far, we’ve focused on the nature and characteristics of knowledge graphs. Now let’s talk about what is not a knowledge graph. 

Not every RDF graph is a knowledge graph

A graph-based representation of data is valuable, but there are many use cases when we don’t need to capture the semantic knowledge in the data. 

For example, when statistical data like GDP for different countries is represented in RDF, this is not a knowledge graph. Here, we don’t need to define the meaning of what countries are or what the ‘Gross Domestic Product’ of a country is. It’s enough just to have the string ‘China’ associated with the string ‘GDP’ and the number ‘18.1 trillion’. 
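To illustrate, such a purely statistical representation might contain little more than triples like the one below (sketched in Turtle with an illustrative identifier), with no class definitions, links or formal semantics behind them:

```turtle
@prefix : <http://example.org/> .

# A bare data point: the identifiers and the value carry no formal meaning of their own.
:China :GDP "18.1 trillion" .
```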

So, what makes a knowledge graph is not the specific language used to represent the data, but the semantically described connections between entities and the underlying graph structure.

Not every knowledge base is a knowledge graph

Knowledge bases that lack formal structure and semantics also don’t qualify as knowledge graphs. One such example is a Q&A knowledge base about a product. Another is an expert system with data organized in a non-graph format that uses automated inference (e.g., a set of ‘if-then’ rules) to facilitate analysis.

An essential characteristic of a knowledge graph is that entity descriptions should be interlinked with one another. Each entity definition should include references to other entities, forming the basis of the graph structure.

Knowledge graphs are not software

Knowledge graphs are powerful frameworks for organizing data and metadata, designed to meet specific criteria and purposes. They are not software. Instead, the data and associated metadata of one knowledge graph can be used and reused by various independent systems to enable diverse functionalities.

A variety of software applications leverage knowledge graphs. These include databases for structured queries, automated reasoners for inferring new relationships, full-text search engines for efficient content searches, editing and curation tools for modifying and extending knowledge graphs, and many more.

To wrap it up

Knowledge graphs have the potential to redefine the way we organize, interpret and use our data. Their ability to interconnect diverse data sources and capture complex relationships enables organizations to gain deeper insights and make more informed decisions. It also unlocks new possibilities for collaboration and innovation. As this technology continues to advance, we can expect knowledge graphs to become even more sophisticated and powerful, bringing greater value to enterprise data.

Break the ice and get to know your first knowledge graph!


Gergana Petkova, Content Manager at Ontotext, is a philologist with more than 15 years of experience at the company, working on technical documentation, Gold Standard corpus curation and preparing content about Semantic Technology and Ontotext's offerings.
