
Human-computer Collaboration with Text Analysis for Content Management

Computers have programmed us, not the other way around.

June 29, 2021 · 6 min read · Jarred McGinnis

The history of computing has largely been a history of humans forced to follow the whims of our machines. No humans used binary until Leibniz, and even Leibniz didn’t find much use for it. There were attempts at decimal computers, built to stay closer to the numeral system humans actually use, but they didn’t last. We are stuck with binary computers because computers run on electricity and voltages vary. To make machines reliable, they were designed around two states: with voltage (aka 1) or without (aka 0). We’ve been bound to communicate with computers in that rather ugly and verbose manner of zeros and ones ever since.

I don’t think anyone on the Apollo missions would have chosen punch cards, a technology originally used to program Victorian industrial looms, to take man to the moon fifty years ago, but that’s what they had to work with. The progress of human-computer interaction has been a slow and steady movement toward higher levels of abstraction, ever less bound to the underlying implementation. The keyboard let users type the letters of a word in a human language rather than raw binary. The mouse turned our screens into a metaphor for a two-dimensional plane. Humans had to make the conceptual shift to understand that pushing the mouse away across the horizontal plane of the desk moved the cursor up the vertical plane of the screen. Humans are clever like that. We are adaptable where machines are rigid.

Bits and Bytes into Knowledge and the Virtuous Cycle

As the physical interactions between humans and computers move toward more human-centric and intuitive models such as gesture computing, augmented reality and embedded computers, an important gap remains between man and machine: understanding. Knowledge-driven human-computer interaction is closing that gap with ontologies, formal definitions of knowledge and inference. Knowledge graphs have become essential to getting machines to understand the needs of humans.

Text Analysis for Content Management solutions are at the forefront of knowledge-driven computing. By using a knowledge graph database like GraphDB and natural language processing (NLP), content becomes connected, dynamic, meaningful and contextual. This enables the automation of knowledge tasks as well as empowering analytical tools for human experts to discover insights and make decisions.

There are a number of approaches Ontotext uses to transform documents and data of all flavours (e.g. structured and unstructured) into a format that is accessible as interconnected knowledge. The solution used will always depend on the use case and the data of the client. Ontotext has years of experience working with clients to tailor a solution that best addresses the organization’s needs. This has resulted in a proven methodology and tools to deliver these solutions.

Text Analysis

Our content is understandable to us humans because we understand concepts like words, parts of speech and even the meaning of the blank space between sets of letters. To the machine, it is still all zeros and ones. Text analysis is a generic term for the various processes that order and structure the vagaries of human language into a format computers can interpret. Ontotext’s unique offering is to integrate a pragmatic approach to text analysis with GraphDB’s inference to close the gap between our language and the computer’s.
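As a rough illustration of what such a process produces, here is a minimal sketch using the open-source spaCy library rather than Ontotext’s own pipeline; the en_core_web_sm model name is an assumption and has to be downloaded separately.

```python
# Illustrative only: a generic NLP library turning raw text into structured annotations.
# Assumes spaCy is installed and the small English model has been downloaded:
#   pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

text = "Joe Biden was sworn in as President of the United States in January 2021."
doc = nlp(text)

# Tokens and their parts of speech: the structure hiding between the blank spaces.
for token in doc:
    print(token.text, token.pos_)

# Named entities: the pieces a knowledge graph can later anchor to.
for ent in doc.ents:
    print(ent.text, ent.label_)
```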

Often the volume of content and the specific data mining challenges require Ontotext to employ machine learning techniques. This process often, but not always, involves human subject matter experts going through samples of documents and performing the same task the machine will be expected to perform. Their annotations act as the gold standard against which the machine learning algorithm is trained. It is a labour-intensive manual process, but it can result in high levels of accuracy and precision.
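What “training against a gold standard” implies in practice can be sketched as comparing the machine’s extracted entities with the experts’ annotations and scoring precision and recall; the annotation sets below are invented purely for illustration.

```python
# Hypothetical example: scoring machine-extracted entities against a human gold standard.
# Both annotation sets are invented; each item is (document id, entity text, entity type).

gold = {("doc1", "Joe Biden", "Person"), ("doc1", "United States", "Country"),
        ("doc2", "GraphDB", "Product")}
predicted = {("doc1", "Joe Biden", "Person"), ("doc1", "January 2021", "Date"),
             ("doc2", "GraphDB", "Product")}

true_positives = len(gold & predicted)
precision = true_positives / len(predicted)   # how much of what was extracted is correct
recall = true_positives / len(gold)           # how much of the gold standard was found

print(f"precision={precision:.2f} recall={recall:.2f}")
```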

The distinguishing feature of Ontotext is that these NLP processes are integrated with a knowledge graph. The extracted entities and their relationships are put into the database, which provides a formal definition of those entities and relationships that can be reasoned about. The computer not only knows that ‘Joe Biden’ is a thing called a ‘Person’ but that he has a relationship to another entity called ‘The United States’. By virtue of being defined as a person, the database can infer additional information about ‘Joe Biden’ that is not explicitly stated in the source content. The beauty of this approach is that the newly extracted entities and their relationships are added to the knowledge graph, which in turn improves the performance of NLP on subsequent documents.
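To make the ‘Joe Biden’ example concrete, here is a minimal sketch, using the open-source rdflib library rather than GraphDB itself, of how extracted facts become triples and how a class definition lets the machine derive a fact that no document stated. The namespace and class names are illustrative assumptions; GraphDB performs this kind of RDFS reasoning as data is loaded.

```python
# Minimal sketch with rdflib (not GraphDB): extracted facts become triples,
# and a simple rdfs:subClassOf rule derives a fact never stated in the source text.
from rdflib import Graph, Namespace, RDF, RDFS

EX = Namespace("http://example.org/")   # illustrative namespace, not a real ontology
g = Graph()

# Facts produced by entity and relationship extraction.
g.add((EX.JoeBiden, RDF.type, EX.Person))
g.add((EX.JoeBiden, EX.presidentOf, EX.UnitedStates))

# Ontology: every Person is also an Agent.
g.add((EX.Person, RDFS.subClassOf, EX.Agent))

# A toy reasoner: apply the subClassOf rule until nothing new can be derived.
changed = True
while changed:
    changed = False
    inferred = []
    for subject, _, cls in g.triples((None, RDF.type, None)):
        for _, _, parent in g.triples((cls, RDFS.subClassOf, None)):
            if (subject, RDF.type, parent) not in g:
                inferred.append((subject, RDF.type, parent))
    for triple in inferred:
        g.add(triple)
        changed = True

# The graph now 'knows' Joe Biden is an Agent, although no document said so.
print((EX.JoeBiden, RDF.type, EX.Agent) in g)   # True
```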

Content Management

To a machine, our most valuable assets and most important documents are a grey goo of bytes. And yet, to companies, that grey goo is black crude in the ground. Companies can and do see the worth in their content and data, but extracting that value requires collection, refinement, management and delivery. Depending on the requirements of the business case, the content management solutions Ontotext delivers can be grouped into a number of tasks:

  • Document Classification – categorizing your documents against a standard or your own taxonomy;
  • Named Entity Recognition – extracting different classes of entities such as people, places, and companies;
  • Relationship Extraction – discovering relations between entities such as ‘works for’, ‘located in’, ‘employee of’;
  • Recommendation Services – providing contextual or behavioural recommendations;
  • Semantic Search – providing faceted, free-text and knowledge-based search capabilities (a minimal query sketch follows this list).
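As a hedged illustration of knowledge-based search, the sketch below sends a SPARQL query to a GraphDB repository over its SPARQL endpoint; the repository name, the endpoint URL and the properties used in the query are assumptions made for the example.

```python
# Illustrative semantic search: ask the knowledge graph, not the raw text.
# Assumes a local GraphDB instance with a repository named "news";
# the endpoint URL pattern and the example properties are assumptions.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("http://localhost:7200/repositories/news")
endpoint.setQuery("""
    PREFIX ex: <http://example.org/>
    SELECT ?document ?person WHERE {
        ?document ex:mentions ?person .
        ?person a ex:Person ;
                ex:holdsOfficeIn ex:UnitedStates .
    }
""")
endpoint.setReturnFormat(JSON)

results = endpoint.query().convert()
for row in results["results"]["bindings"]:
    print(row["document"]["value"], "mentions", row["person"]["value"])
```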

All these solutions tackle the semantic gap between the human and computer views of content. To stay with the crude oil metaphor, if text analysis is the extraction, then content management is the refinement and the pipelines delivering the product. Solutions like document classification ensure that the right content reaches the right audiences. Academic publishers often work across a broad range of disciplines, and a computer that can automatically tell a paper on ‘Yukawa Coupling’ from one on ‘Coupled Map Lattices’ adds real value to the user experience and satisfaction.
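A minimal sketch of that kind of classification, using scikit-learn rather than Ontotext’s tooling and an invented two-document training set, might look like this:

```python
# Toy document classification sketch: TF-IDF features plus a linear classifier.
# The training snippets and labels are invented; a real deployment would train
# on expert-annotated documents against the publisher's own taxonomy.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "Yukawa coupling between the Higgs field and fermions in the Standard Model",
    "Coupled map lattices as spatially extended models of chaotic dynamics",
]
train_labels = ["particle physics", "nonlinear dynamics"]

classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(train_texts, train_labels)

print(classifier.predict(["fermion masses generated by Yukawa interactions"]))
```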

Named entity recognition and relationship extraction have proven vital in intelligence tools for publishers in the finance industry, because their content is entity-rich and there is monetary value for their subscribers in identifying relationships before anyone else. Recommendation services and semantic search only grow more valuable as content keeps expanding exponentially, because at any given moment we are searching for ‘that one thing’ we need.

Human-Computer Collaboration

As the means of human-computer interaction comes closer and closer to the way humans actually interact with each other, machines will need to understand the world that concerns us. The physical interactions are already becoming closer, and perhaps even too close.

Knowledge-driven computing such as Ontotext’s content management solutions will be essential to closing the semantic gap between us and our machines. No longer will we have to teach ourselves the idiosyncrasies of computers or interpret their outputs with respect to the real world. With semantic technology, we will move toward human-computer collaboration rather than mere interaction.

Do you want to learn more about Ontotext’s content management solutions?

White Paper: Text Analysis for Content Management
Learn how we can make your content serve you better!




Jarred McGinnis is a managing consultant in Semantic Technologies. Previously he was the Head of Research, Semantic Technologies, at the Press Association, investigating the role of technologies such as natural language processing and Linked Data in the news industry. Dr. McGinnis received his PhD in Informatics from the University of Edinburgh in 2006.
