It Takes Two to Tango: Knowledge Graphs and Text Analysis

Knowledge Graphs Help Text Analysis, Which Helps Knowledge Graphs, Which Help Text Analysis, Which Helps...

November 5, 2019 5 mins. read Jarred McGinnis

Ontotext Platform synergizes knowledge graphs and text analysis as follows:

  • Knowledge graphs can improve text analysis performance. Big knowledge graphs provide rich semantic profiles of all the popular concepts in a given domain and allow those to be more accurately recognized in text.
  • Text analysis can extract new concepts and relationships, which can be added to enrich a knowledge graph with information not available from structured sources.

Call it symbiosis, a virtuous cycle or just good engineering: the Ontotext advantage lies in coupling two technologies (text analysis and knowledge graphs) that complement each other to better solve today’s content challenges. Text analysis helps machines parse and organise the messiness inherent in human language. Knowledge graphs reduce the semantic gaps between a human understanding of information and the computer’s structuring of that information.

Knowledge Graphs Help Text Analysis, Which Helps…

Machines need consistency, constancy and unambiguity. They are built up from the clear, distinct states of on-off, one-zero, true-false. Human language, on the other hand, has emerged from millennia of history, geography, culture and happenstance. The meaning of words is often ambiguous, variable and context-dependent. Humans use language to communicate but also to obfuscate meaning. The endless expressivity of our language is why it is such a powerful tool for us and the reason it is so hard for machines to replicate our language abilities.

A full explanation of text analysis can be found in our fundamentals article. Ultimately, the goal of text analysis is to bring some tidiness to the messiness of language. The analysis covers tasks ranging from separating parts of speech (e.g. nouns, verbs, adjectives, etc.) to identifying concepts and entities such as people, organisations, place names, chemical compounds and products, as well as the relationships between them.

Identifying the entities, often the proper nouns, is an extremely useful shortcut to identifying what content is about. Humans, too, are particularly focused on proper nouns for understanding. Text analysis typically uses a list, a gazetteer, as a first step to identify and categorise these concepts. Making such a list is a relatively straightforward, if time-consuming, process. It is the maintenance of those lists that becomes unwieldy against that mercurial disorder of language. Languages are not static. The topics we discuss using language are also not static. Words appear and disappear. The meanings of words slip and slide constantly (see autoantonym).
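To make the gazetteer step concrete, here is a minimal sketch of list-based tagging. It is not the Ontotext Platform's implementation: the `GAZETTEER` entries, the type names and the `tag_entities` function are all hypothetical, chosen only to illustrate the idea of looking up known names in text.

```python
import re

# Hypothetical toy gazetteer: surface form -> entity type.
GAZETTEER = {
    "Paris": "Location",
    "France": "Location",
    "Warner Bros.": "Organisation",
}

def tag_entities(text, gazetteer=GAZETTEER):
    """Return (surface form, type, start offset) for every gazetteer match."""
    # Try longer names first so multi-word entries win over shorter ones.
    names = sorted(gazetteer, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in names)
    # The lookarounds stop a name matching inside a longer word ("Parisian").
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)")
    return [(m.group(0), gazetteer[m.group(0)], m.start())
            for m in pattern.finditer(text)]

print(tag_entities("Paris is the capital of France."))
# [('Paris', 'Location', 0), ('France', 'Location', 24)]
```

Even in this toy form, the maintenance burden is visible: every new name, spelling or sense has to be added to the dictionary by hand.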

One of the ways the Ontotext Platform solves this problem is by utilising knowledge graphs to create and maintain the gazetteer. Harnessing the global and encyclopaedic information of Linked Open Data (LOD), the platform provides a way for machines to organise text in a humanly useful way. Search and analytics improve immediately. The machine can not only identify (or ‘tag’) entities and the relationships amongst them within a single document; it also imposes a universal structure across the organisation, so that both the machine and its human users can understand the entities and their relationships across all documents. This normalisation of data is independent of language and system. Whether the text refers to the capital of France as Paris, Париж or 巴黎, the computer knows the content or data is referring to the same entity. Additionally, the Ontotext Platform makes use of an organisation’s internal data to tag entities specific to the organisation and terms used within its industry.
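The language-independent normalisation described here can be sketched as a lookup from labels to a single identifier. The identifier `ex:Paris_France`, the `LABELS` table and the `normalise` function are assumptions made for illustration; a real knowledge graph would supply such labels from its own data.

```python
# Hypothetical entity identifier with labels in several languages.
LABELS = {
    "ex:Paris_France": {"en": "Paris", "ru": "Париж", "zh": "巴黎"},
}

# Invert the table: every label, in any language, points to one identifier.
LABEL_INDEX = {
    label.casefold(): uri
    for uri, labels in LABELS.items()
    for label in labels.values()
}

def normalise(mention):
    """Map a surface form to its entity identifier, or None if unknown."""
    return LABEL_INDEX.get(mention.casefold())

print(normalise("Париж") == normalise("巴黎") == normalise("Paris"))
```

Once every mention resolves to the same identifier, documents in different languages or systems can be searched and analysed as if they used one vocabulary.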

The knowledge graph also improves disambiguation. ‘Paris’ could be a person’s first name as well as a city in France or Texas, USA. By combining the entities identified in the surrounding text with the connections recorded in the graph, the knowledge base allows the text analysis to identify the correct entity with more certainty. For a specific example of how the knowledge graph provides context and concept awareness, refer to the webinar Graph Analytics on Company Data and News.
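One simple way to picture graph-based disambiguation is to score each candidate by how many of its graph neighbours also appear in the surrounding text. The sketch below assumes a tiny hand-made graph; the identifiers, the `GRAPH` and `CANDIDATES` tables and the overlap scoring are illustrative choices, not a description of how the Ontotext Platform ranks candidates.

```python
# Hypothetical mini graph: candidate entity -> entities it is connected to.
GRAPH = {
    "ex:Paris_France": {"ex:France", "ex:Seine", "ex:Eiffel_Tower"},
    "ex:Paris_Texas": {"ex:Texas", "ex:USA", "ex:Lamar_County"},
    "ex:Paris_Smith": {"ex:Acme_Corp"},  # an invented person named Paris
}
CANDIDATES = {"Paris": ["ex:Paris_France", "ex:Paris_Texas", "ex:Paris_Smith"]}

def disambiguate(mention, context_entities):
    """Pick the candidate sharing the most graph neighbours with the context."""
    context = set(context_entities)
    scored = [(len(GRAPH[c] & context), c) for c in CANDIDATES.get(mention, [])]
    best_score, best = max(scored, default=(0, None))
    # With no overlap at all there is no evidence, so decline to choose.
    return best if best_score > 0 else None

print(disambiguate("Paris", ["ex:Texas", "ex:USA"]))  # ex:Paris_Texas
```

Returning `None` when nothing overlaps is a deliberate choice in this sketch: an uncertain mention is left untagged rather than guessed.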

Text Analysis, Which Helps Knowledge Graphs, Which Help…

The knowledge graph suggests entities that are currently known. Inevitably, new people, companies and products appear in content, and these are often the ones of greatest current interest. The Ontotext Platform’s text analysis is also able to tag candidate entities that aren’t yet in the gazetteer. By making use of the known entities identified in the same text, it can suggest new entities and their types (e.g. person, location, company, brand, etc.).

Depending on its certainty, the platform can automatically or semi-automatically add the newly identified entities and infer new relationships. Take the following sentence as an example: ‘Ann Sarnoff has been appointed the new CEO of Warner Bros.’ Even if the knowledge graph isn’t aware of ‘Ann Sarnoff’ or of the company ‘Warner Bros.’, the Ontotext Platform can add, with a high degree of certainty, a surprising amount of new information to the knowledge graph:

  1. Ann Sarnoff is a person.
  2. She has a job title of ‘CEO’.
  3. ‘Warner Bros.’ is a company.
  4. ‘Warner Bros.’ has a position of ‘CEO’.
  5. Ann Sarnoff is an employee of ‘Warner Bros.’
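The five statements above can be written as subject-predicate-object triples. The sketch below derives them from the example sentence with a single hand-written pattern; the pattern, the predicate names (`hasJobTitle`, `hasPosition`, `employeeOf`) and the `extract_facts` function are hypothetical, and real text analysis relies on far more than one regular expression.

```python
import re

# Hypothetical pattern for one sentence shape:
# "<Person> has been appointed the new <TITLE> of <Company>"
APPOINTMENT = re.compile(
    r"(?P<person>[A-Z]\w+(?: [A-Z]\w+)+) has been appointed the new "
    r"(?P<title>[A-Z]+) of (?P<company>[A-Z][\w.]*(?: [A-Z][\w.]*)*)"
)

def extract_facts(sentence):
    """Return triples for the appointment pattern, or [] if it is absent."""
    m = APPOINTMENT.search(sentence)
    if not m:
        return []
    person, title, company = m.group("person", "title", "company")
    return [
        (person, "rdf:type", "Person"),
        (person, "hasJobTitle", title),
        (company, "rdf:type", "Company"),
        (company, "hasPosition", title),
        (person, "employeeOf", company),
    ]

for triple in extract_facts("Ann Sarnoff has been appointed the new CEO of Warner Bros."):
    print(triple)
```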

That is a lot of information extracted from one simple sentence. In practice, the text analysis processes large volumes of sentences and enriches the knowledge base with thousands or even millions of new pieces of information. This new information is added to the knowledge graph, making it richer, more exhaustive and better able to identify new entities and relationships. For a more detailed and technical explanation of how semantic tagging is modelled in the Ontotext Platform, read Ontotext Platform: A Global View Across Knowledge Graphs and Content Annotations.
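The automatic versus semi-automatic enrichment can be pictured as a confidence threshold: facts above it go straight into the graph, the rest wait for a human decision. The threshold value, the confidence scores and the `enrich` function below are assumptions made for illustration only.

```python
AUTO_ADD_THRESHOLD = 0.9  # hypothetical cut-off, not a platform setting

def enrich(graph, review_queue, candidate_facts, threshold=AUTO_ADD_THRESHOLD):
    """Add confident facts to the graph; queue the rest for human review."""
    for fact, confidence in candidate_facts:
        if fact in graph:
            continue  # already known, nothing to do
        if confidence >= threshold:
            graph.add(fact)  # automatic enrichment
        else:
            review_queue.append((fact, confidence))  # semi-automatic path
    return graph, review_queue

graph = {("Warner Bros.", "rdf:type", "Company")}
queue = []
enrich(graph, queue, [
    (("Ann Sarnoff", "rdf:type", "Person"), 0.97),
    (("Ann Sarnoff", "employeeOf", "Warner Bros."), 0.72),
])
print(len(graph), len(queue))  # 2 1
```

Each fact accepted this way becomes part of what the next round of analysis already knows, which is the cycle the article describes.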

Jarred McGinnis is a managing consultant in Semantic Technologies. Previously he was the Head of Research, Semantic Technologies, at the Press Association, investigating the role of technologies such as natural language processing and Linked Data in the news industry. Dr. McGinnis received his PhD in Informatics from the University of Edinburgh in 2006.
