A large US health insurance provider wanted to evaluate the capabilities of GraphDB, Ontotext’s leading RDF database, for creating a knowledge graph (KG) that would allow researchers to generate and evaluate novel hypotheses. For this purpose, they selected a clearly defined PoC use case: building a knowledge graph focused on Age-related Macular Degeneration (AMD). The AMD KG used a subset of PubMed, the Semantic MEDLINE Database (SemMedDB), and various biomedical ontologies and thesauri semantically integrated into the Unified Medical Language System (UMLS).
AMD is a leading cause of irreversible blindness and visual impairment worldwide. In the US alone, it affects about 11 million people. By creating a rich knowledge graph with information extracted from scientific articles, the health insurer wanted to test whether this technology could enable the discovery of remote associations between AMD and different genetic markers, drugs, therapies, etc. Such insights would greatly facilitate the identification and verification of new hypotheses for treating the disease. The PoC project lasted 4 months.
The main challenge in creating an AMD KG was that medical knowledge about the disease was scattered across a rapidly growing number of sources containing genomic, molecular, and other biomedical data. The data coming from these sources was highly fragmented and contained a lot of semantic redundancy (ambiguity) and other discrepancies.
The systems used by the health insurer at the time relied on manual processing and struggled to keep up with the vast volumes of published research and to extract relevant knowledge from the growing medical literature. As a result, they were hard pressed to deliver a comprehensive picture that would help researchers develop a better understanding of the disease mechanisms, generate novel hypotheses, and provide better care.
The KG-based data discovery solution provided by Ontotext enables the smooth integration of both structured data (about symptoms, known treatments, etc.) and metadata extracted from articles published in PubMed (a repository of medical literature maintained by the National Library of Medicine). Ontotext’s proven methodology for semantic data modeling normalizes both the data schema and the instances to concepts from major medical ontologies and vocabularies.
Thanks to GraphDB’s unique inference capabilities, users can derive new knowledge from existing facts, which adds extra explanatory power to their knowledge discovery. Applying logical rules to the data at scale turns the disjointed pieces of information coming from different sources into a richer network of knowledge.
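To make the idea of deriving new facts from existing ones more concrete, here is a minimal sketch in plain Python. It does not use GraphDB or its rule language; it only illustrates the general technique of forward chaining with a single transitivity rule. The function name `infer_transitive`, the predicate name `subClassOf`, and the sample facts are all assumptions made for illustration and do not come from the actual AMD KG.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def infer_transitive(triples: Set[Triple], predicate: str) -> Set[Triple]:
    """Apply a transitivity rule for one predicate until no new facts appear.

    Rule: (a predicate b) and (b predicate c)  =>  (a predicate c)
    """
    facts = set(triples)
    while True:
        derived = {
            (a, predicate, d)
            for (a, p1, b) in facts if p1 == predicate
            for (c, p2, d) in facts if p2 == predicate and b == c and a != d
        }
        if derived <= facts:  # fixpoint reached: nothing new was derived
            return facts
        facts |= derived


# Illustrative facts only (hypothetical class names, not taken from UMLS).
EXPLICIT = {
    ("wet_AMD", "subClassOf", "AMD"),
    ("AMD", "subClassOf", "macular_degeneration"),
    ("macular_degeneration", "subClassOf", "retinal_disease"),
}

ALL_FACTS = infer_transitive(EXPLICIT, "subClassOf")
INFERRED = ALL_FACTS - EXPLICIT

for triple in sorted(INFERRED):
    print(triple)
```

In this toy example, three explicit statements yield three additional inferred ones, such as the statement that `wet_AMD` is a kind of `retinal_disease`, even though no single source said so directly.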
The resulting high-quality AMD knowledge graph provides single-point access to complex data from multiple sources and enables the discovery of intricate correlations between the concepts described in this data. This approach creates comprehensive 360-degree views of all relevant biomedical concepts and helps researchers follow chains of connected facts, which could give them insight into a particular problem.
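Following a chain of connected facts can be pictured as a path search over the graph. The sketch below uses a plain breadth-first search in Python rather than a GraphDB query, and it is only one possible way to illustrate the idea. The function name `find_chain`, the predicate names, and the placeholder entities (`GeneX`, `ProteinY`, `DrugZ`, `SymptomW`) are hypothetical and carry no medical meaning.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def find_chain(triples: List[Triple], start: str, goal: str) -> Optional[List[Triple]]:
    """Return a shortest chain of facts leading from start to goal, or None."""
    # Index outgoing facts by subject for quick lookup.
    outgoing: Dict[str, List[Triple]] = {}
    for triple in triples:
        outgoing.setdefault(triple[0], []).append(triple)

    queue = deque([(start, [])])  # (current node, facts followed so far)
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for triple in outgoing.get(node, []):
            target = triple[2]
            if target not in visited:
                visited.add(target)
                queue.append((target, path + [triple]))
    return None


# Placeholder facts for illustration only.
FACTS = [
    ("AMD", "has_symptom", "SymptomW"),
    ("AMD", "associated_with", "GeneX"),
    ("GeneX", "encodes", "ProteinY"),
    ("ProteinY", "inhibited_by", "DrugZ"),
]

chain = find_chain(FACTS, "AMD", "DrugZ")
if chain is not None:
    for subject, predicate, obj in chain:
        print(f"{subject} --{predicate}--> {obj}")
```

Each fact in the returned chain may originate from a different source, which is how a remote association between a disease and a candidate drug can surface even when no single article states it.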
Ontotext’s KG-based approach in general and GraphDB in particular provided the health insurer with an advanced solution for consolidating various structured and unstructured data coming from multiple sources in a way that enabled easy discovery and analytics.
This case study concerns a socially significant disease, and by providing this KG-based solution, Ontotext also hopes to contribute to the process of generating and evaluating novel treatments.
Ontotext’s solution was built for a very specific challenge faced by a health insurance company, but its functionality is applicable to many other domains because it is based on generic technology.
Do you think this case resembles your particular needs?