Ontotext Platform: Semantic Annotation Quality Assurance & Inter-Annotator Agreement

Ontotext’s vision, technology and business are about making sense of text and data. Letting big knowledge graphs improve the accuracy of text analytics. Using text analytics to interlink and enrich knowledge graphs. Enabling better search, exploration, classification and recommendation across diverse information spaces. This series of blog posts provides technical insights into the Ontotext Platform and its design choices for processing large volumes of unstructured content using very large knowledge graphs, ensuring excellent annotation quality with efficient management of data: domain knowledge, annotations and unstructured content.

January 18, 2019 9 mins. read Jem Rayfield

The quality of semantic annotations has a direct impact on the applications that use them. This blog post focuses on the platform’s Curation Tool and its role in improving annotation quality.

As discussed in my previous post, the Ontotext Platform is often required to process and reprocess millions of unstructured content items using the platform’s text analytics [TA] components.


An unstructured content archive may need to be processed or re-processed to discover and add additional knowledge or to train a machine learning model. In these scenarios, Ontotext’s text analytics components may well create tens of billions of annotations that need to be processed, re-processed and stored quickly with little or no impact on a live running knowledge graph.

Text analytics annotation metadata adds information to unstructured content at some level: a word or phrase, a paragraph or section, an entire document, a polygon within a Scalable Vector Graphic or perhaps a time-code within a video. Annotations connect unstructured content to concepts and background knowledge stored within a GraphDB knowledge graph, an ontology or a gazetteer (text analytics dictionary).

The TA services represent machine suggestions using an extended version of the W3C Web Annotation Model [WA] JSON-LD.

Annotations capture the semantic fingerprint of unstructured content: the structured knowledge contained within fragments of that content, expressed as URI references to a GraphDB knowledge graph. Annotations include quantitative attributes such as confidence or relevance to support comparison.
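As a rough illustration, a machine suggestion of this kind might be built as follows. The `type`, `body`, `target` and `TextPositionSelector` fields and the `anno.jsonld` context come from the W3C Web Annotation Model; the `confidence` and `relevance` terms, the `http://example.org/` namespace and every URI below are hypothetical stand-ins, since the platform's actual extensions are not spelled out here.

```python
import json

# Hypothetical extension terms; the platform's real extension vocabulary is not shown in this post.
EXTENSION_CONTEXT = {
    "confidence": "http://example.org/ns#confidence",
    "relevance": "http://example.org/ns#relevance",
}

def make_suggestion(doc_uri, text, phrase, concept_uri, confidence, relevance):
    """Build a Web Annotation style dict for the first occurrence of phrase in text."""
    start = text.index(phrase)
    return {
        "@context": ["http://www.w3.org/ns/anno.jsonld", EXTENSION_CONTEXT],
        "type": "Annotation",
        "motivation": "identifying",
        "body": concept_uri,  # URI of the knowledge graph entity
        "target": {
            "source": doc_uri,
            "selector": {
                "type": "TextPositionSelector",
                "start": start,
                "end": start + len(phrase),
            },
        },
        "confidence": confidence,  # assumed extension field
        "relevance": relevance,    # assumed extension field
    }

text = "Amazon opened a new fulfilment centre."
suggestion = make_suggestion(
    "http://example.org/content/123", text, "Amazon",
    "http://example.org/resource/Amazon.com", 0.92, 0.75,
)
print(json.dumps(suggestion, indent=2))
```

Keeping the score fields alongside the standard body and target is one way to let downstream consumers compare competing suggestions for the same fragment.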

TA annotation suggestions are published as events to an event queue to allow processing to be performed asynchronously. Suggestion events are consumed and in some cases moderated by a team of curators. Curators moderate suggestions using the Ontotext Platform Curation Tool.
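The asynchronous hand-off described above can be sketched with Python's standard library. This is only a toy model: an in-process `queue.Queue` stands in for the platform's event queue, and the event fields (`annotation_id`, `confidence`) are assumed for illustration.

```python
import queue
import threading

suggestions = queue.Queue()  # stand-in for the platform's event queue
curation_inbox = []          # events waiting for a curator

def consume():
    """Pull suggestion events off the queue until the sentinel arrives."""
    while True:
        event = suggestions.get()
        try:
            if event is None:  # sentinel: the producer has finished
                break
            curation_inbox.append(event)
        finally:
            suggestions.task_done()

worker = threading.Thread(target=consume)
worker.start()

# The producer (text analytics) publishes without waiting for the consumer.
for i in range(3):
    suggestions.put({"annotation_id": f"a{i}", "confidence": 0.5 + i / 10})
suggestions.put(None)

worker.join()
print([event["annotation_id"] for event in curation_inbox])
```

The point of the queue is that annotation producers never block on curation; curators drain the backlog at their own pace.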

Curation moderation aims to improve annotation quality, vocabulary depth and breadth, and ultimately text analytics precision and recall.

The Curation process has a small set of general principles:

  • Guidelines: the Curation Tool must allow an annotator to moderate and fix annotations so that they align with subject matter expert (SME) annotation guidelines.
  • Isolation: curators must moderate annotations in isolation, with distinct users, profiles and annotation statistics.
  • Conflict: conflicts can only be resolved by an administrator (supervisor) who is an SME.
  • Agreement: a configurable threshold should define inter-annotator agreement (IAA). Once it is met, an annotation must be automatically set as “Accepted”.
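The agreement and conflict principles can be sketched as a small decision function. This is a minimal sketch under stated assumptions: the function name, the `"accept"`/`"reject"` verdict values, the `"Pending"` and `"Conflict"` states and the default threshold are all hypothetical, and agreement is taken here to be the share of curators behind the most common verdict.

```python
from collections import Counter

def resolve(verdicts, threshold=0.75, min_curators=2):
    """Decide an annotation's status from isolated curator verdicts.

    verdicts maps a curator name to "accept" or "reject".
    """
    if not 0.5 < threshold <= 1.0:
        raise ValueError("threshold must be in (0.5, 1.0] so ties cannot pass")
    if not verdicts or len(verdicts) < min_curators:
        return "Pending"  # not enough independent opinions yet
    verdict, votes = Counter(verdicts.values()).most_common(1)[0]
    agreement = votes / len(verdicts)
    if agreement >= threshold:
        return "Accepted" if verdict == "accept" else "Rejected"
    return "Conflict"  # left for an administrator (SME) to resolve

print(resolve({"ann": "accept", "bob": "accept", "cat": "accept"}))
print(resolve({"ann": "accept", "bob": "reject"}))
```

Requiring a threshold above one half means a tie can never be accepted automatically, which keeps conflicts flowing to the supervisor.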

Annotation concepts may represent single entities such as a Person, Location or Organization or a more complex relationship such as a “Person’s Role within an Organization”, “A Company’s Merger with another Company”, etc.

Annotations may require curation when they are ambiguous or “fuzzy”, or when their confidence scores fall below a configurable threshold. There may also be word sense disagreement between isolated curators.
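One simple way to decide which suggestions go to curators is sketched below. The function and both parameters are hypothetical: `min_confidence` plays the role of the configurable threshold, and `min_margin` is an assumed proxy for ambiguity (two candidates scoring almost the same).

```python
def needs_curation(candidates, min_confidence=0.8, min_margin=0.1):
    """Return True if a text fragment's suggestions should be sent to curators.

    candidates is a non-empty list of (concept_uri, confidence) pairs.
    """
    if not candidates:
        raise ValueError("expected at least one candidate annotation")
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    best = ranked[0][1]
    if best < min_confidence:
        return True  # low confidence
    if len(ranked) > 1 and best - ranked[1][1] < min_margin:
        return True  # ambiguous: the runner-up is too close to the winner
    return False

print(needs_curation([("ex:Amazon.com", 0.95), ("ex:AmazonRiver", 0.40)]))
print(needs_curation([("ex:Amazon.com", 0.90), ("ex:AmazonRiver", 0.85)]))
```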

It is important that ambiguous edge cases are fixed and rationalized to keep annotation precision as high as possible.

When dealing with sentiment annotations, the text below could be annotated with “sad” or perhaps “repulse” sentiments. It is also possible that annotation guidelines deem “repulse” an inappropriate sentiment when associated with text covering sensitive subjects. It would, therefore, be likely that “repulse” needs to be moderated and removed. The Curation Tool allows a curator to modify “fuzzy” text analytics suggestions to ensure quality is kept high.

"An Indian woman was allegedly set on fire and killed by her husband’s family because she was too dark-skinned and they wanted him to remarry a fairer bride."

Text fragments can also belong to multiple concepts or categorizations simultaneously. Overlapping annotations increase flexibility and allow annotations to capture the most knowledge. Curators must be able to visualize the different overlaps in order that each annotation can be moderated and reviewed carefully.

For example, the following text includes the Federal Court of Australia (the Organization) and Australia (the Location).

....."Access to this website has been disabled by an order of the Federal Court of Australia because it infringes or facilitates the infringement of copyright," Telstra's landing message reads......

The Curation Tool supports the moderation of these overlapping suggested annotations. It’s possible to moderate the annotation referencing the Organization Federal Court of Australia and the annotation referencing the Location Australia.
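The overlap in the Federal Court of Australia example can be made concrete with character offsets. The sketch below is illustrative only: the dictionary layout and helper names are assumptions, not the Curation Tool's data model.

```python
text = ("Access to this website has been disabled by an order of the "
        "Federal Court of Australia because it infringes copyright")

def annotate(label, phrase):
    """Locate the first occurrence of phrase and return it as a labelled span."""
    start = text.index(phrase)
    return {"label": label, "start": start, "end": start + len(phrase)}

annotations = [
    annotate("Location", "Australia"),
    annotate("Organization", "Federal Court of Australia"),
]

def overlapping_pairs(spans):
    """Return label pairs whose [start, end) character ranges intersect."""
    ordered = sorted(spans, key=lambda s: (s["start"], s["end"]))
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second["start"] >= first["end"]:
                break  # sorted by start, so nothing later can overlap first
            pairs.append((first["label"], second["label"]))
    return pairs

print(overlapping_pairs(annotations))
```

Both annotations remain separate records with their own offsets, so each can be accepted, rejected or replaced independently.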

The system supports a team of curators who moderate a stream of annotated content to eliminate annotation errors. Curators are usually SMEs working to an agreed annotation guideline to support data quality service level agreements.

The following screenshot depicts a stream of unstructured content items ready to be pulled from the curation queue. These content items have been annotated by the “machine” and are selected and ready for the human curation process.


The Curation Tool allows curators to accept or reject novel concepts, and also to add missing annotations or remove incorrect ones.

The following Curation Tool screenshot depicts how a particular piece of text – “Amazon” – includes multiple suggested overlapping candidate annotations. In this particular case, the candidate annotation is set as the Organization Amazon.com, which references a GraphDB knowledge graph entity by its URI. The Curation Tool suggests alternative annotations (ordered by relevance/confidence); in this case, four “Amazon” Locations, six “Amazon” Organizations and two “Amazon” People annotations.

Curators are able to moderate a candidate annotation by replacing it. In this example, the Organization Amazon.com annotation is replaced by one of the other suggested annotations such as the Person Amanda Knox. Curators are also able to select a different candidate from the supplied list (classified by type) or to search for a missing candidate across the entire knowledge graph.

When independent curators reach a level of agreement (inter-annotator agreement), annotations are automatically accepted or rejected. If curators disagree (detected by configurable consensus rules), an administrator/supervisor can override the team’s decisions and ensure that annotations are confirmed as accepted or rejected. Disagreements can occur due to word sense ambiguity or perhaps the characteristics of the curator. Curators may have different levels of familiarity with the material, amount of training, motivation, interest or fatigue. An administrator/supervisor must be an SME with a full understanding of the annotation guidelines and domain in order that administrator/supervisor overrides are likely to be of the highest quality.
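This post does not say which statistic the platform uses to measure agreement. As one common option, Cohen's kappa compares the agreement observed between two curators with the agreement expected by chance; the sketch below assumes two curators judging the same annotations in the same order, with hypothetical `"accept"`/`"reject"` labels.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two curators' verdicts on the same annotations."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both curators must judge the same non-empty set")
    n = len(labels_a)
    # Share of annotations on which the two curators gave the same verdict.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each curator labelled at random with their own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both curators used a single identical label throughout
    return (observed - expected) / (1 - expected)

curator_a = ["accept", "accept", "reject", "accept"]
curator_b = ["accept", "reject", "reject", "accept"]
print(cohen_kappa(curator_a, curator_b))
```

A value near 1 indicates agreement well beyond chance, while a value near 0 suggests the guidelines or training may need attention.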


The following Curation Tool screenshot depicts how an administrator can override conflicting curator annotations. In this particular example, the administrator removes the Person Amanda Knox annotation, which conflicts with the Organization Amazon.com annotation.

Annotation conflict resolution is normally only required for a small number of edge cases but is an important catch for exceptions.

Automatically accepted, refined and moderated annotations are fed back into the “machine” to improve knowledge graph and text analytics vocabularies, corpora and statistical models. The following diagram describes this cyclic continuous improvement flow:


The cycle continuously adapts machine-learned statistical models to improve F1 scores.

Ontotext’s text analytics platform not only discovers known entities within a knowledge graph but it is also able to detect and classify novel unknown entities and relationships. Novel entities are published and fed back into the GraphDB knowledge graph. When discovered by the platform’s text analytics components, they can subsequently go through the moderation process. The Curation Tool detects that the entities are not present within the knowledge graph and allows curators to accept and augment the knowledge graph vocabulary. Ontotext’s text analytics components are continuously updated with new entities. The text analytics architecture includes a configurable Dynamic Gazetteer that allows domain vocabularies (including URIs) to be synchronized in near real-time from the knowledge graph. Synchronization ensures that new entities that are included within the knowledge graph are also present and detected within unstructured content processed by the text analytics components.
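The effect of gazetteer synchronization can be illustrated with a toy dictionary matcher. This is not the platform's Dynamic Gazetteer: the class, its methods, the entity and its URI are all hypothetical, and a direct `sync` call stands in for near real-time updates from the knowledge graph.

```python
import re

class ToyGazetteer:
    """A tiny in-memory dictionary of labels mapped to entity URIs."""

    def __init__(self):
        self._entries = {}  # label -> set of URIs

    def sync(self, new_entities):
        """Add (label, uri) pairs, as if pushed from the knowledge graph."""
        for label, uri in new_entities:
            self._entries.setdefault(label, set()).add(uri)

    def annotate(self, text):
        """Return sorted (start, end, uri) tuples for every whole-word match."""
        found = []
        for label, uris in self._entries.items():
            pattern = r"\b" + re.escape(label) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                for uri in uris:
                    found.append((match.start(), match.end(), uri))
        return sorted(found)

gazetteer = ToyGazetteer()
text = "Acme Robotics opened an office."
before = gazetteer.annotate(text)  # entity not yet known
gazetteer.sync([("Acme Robotics", "http://example.org/resource/AcmeRobotics")])
after = gazetteer.annotate(text)   # entity detected once synchronized
print(before, after)
```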

Quality Metrics

As I stated at the beginning of this post, the quality of semantic annotations has a direct impact on the applications that use them. For example, machine learning algorithms will learn how to make mistakes if the models are trained on a poor-quality golden corpus that includes ambiguity and errors. Linguistic analysis will be misled if annotations are incorrect, and the text analytics results will be poor. Search and discovery applications that query annotations directly will return false positives if quality is not maintained via Curation moderation.

The Ontotext Platform Curation Tool supports the analysis of several metrics:

  • F1 Report: measure text analytics precision and recall for known entity/relationship types over time, supporting KPI and SLA alerts and monitoring.
  • Content Report: measure the numbers of unstructured content items managed and moderated by the text analytics and Curation systems.
  • Curator Changes: measure and analyze the number of annotation changes made by curators to track the level of moderation required over time.
  • Conflicts and resolutions: measure the number of IAA disagreements and administrator/supervisor interventions over time.
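For reference, the precision, recall and F1 figures behind such a report can be computed by comparing machine suggestions with the annotations curators accepted. The sketch below uses the standard definitions; representing an annotation as a `(start, end, uri)` tuple and the function name are assumptions for illustration.

```python
def f1_report(suggested, accepted):
    """Compare machine suggestions with curated annotations.

    Both arguments are iterables of hashable annotations, e.g. (start, end, uri).
    """
    suggested, accepted = set(suggested), set(accepted)
    true_positives = len(suggested & accepted)
    precision = true_positives / len(suggested) if suggested else 0.0
    recall = true_positives / len(accepted) if accepted else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"precision": precision, "recall": recall, "f1": f1}

machine = [(0, 6, "ex:Amazon.com"), (10, 15, "ex:Paris"), (20, 25, "ex:Apple"), (30, 34, "ex:Nile")]
curated = [(0, 6, "ex:Amazon.com"), (10, 15, "ex:Paris"), (20, 25, "ex:Apple"), (40, 44, "ex:Rome")]
print(f1_report(machine, curated))
```

Tracking these three numbers per entity or relationship type over time is what makes threshold-based KPI and SLA alerts possible.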


Curation separates the text analytics wheat from the chaff, preserving quality information and removing noise to produce annotations that accurately capture the knowledge locked within unstructured content.

Ontotext’s Curation Tool increases text analytics transparency and accountability, and provides key management metrics. One is able to gain insights into curator teams’ usage patterns. It is also possible to analyze how the Ontotext text analytics platform and its algorithms are performing and what steps are required to improve precision and recall.

Ontotext’s Curation Tool is integrated with Ontotext’s text analytics and knowledge graph instance management tools in such a way that the platform can automatically learn from user feedback. It also provides mechanisms to automate the addition of new concepts or edit existing concepts so that they’re suggested automatically.

The ability to take data, understand it, process it, moderate it, clean it, visualize it and extract value from it ensures that the Ontotext Platform captures your business knowledge and value accurately.


Head Of Architecture at Allen & Overy

Jem is an experienced software practitioner, architect, and director of development. He has proven himself as one of the best semantic technology solution architects previously working at the BBC and the FT. As Chief Solution Architect, he is helping Ontotext to deliver a comprehensive analytics and publishing platform.
