Question: Which of these two statements is true?
Answer: Both are true.
It is true that the healthcare sector, which includes hospitals, pharmaceutical companies, and insurers, has an enormous amount of data. It has to, because people’s lives are at stake. Results need to be backed by rigour, reproducibility, and accuracy. Compared with other industries, healthcare has a fair amount of structured data, which is helpful. This richness of data, if it can be found and connected, enables the discovery of novel therapies and causal relationships. Just as important, it allows existing negative results to be retrieved, so that the company doesn’t spend millions of dollars to discover what is already known not to work.
And it is true that the amount of data in the healthcare sector is a problem. The data inevitably exists in a variety of formats, both in structured datasets and in unstructured free-text research papers. It is fragmented across a variety of systems around the globe. There are mislabels, typos, redundancies and semantic ambiguities, all of which slow research progress. It is also a moving target, because even as you conduct your research, more data and sources are being added. Existing systems and manual processes cannot cope. What is needed is a technology that can extract and retain the meaning of any new knowledge, and that can provide provenance for each underlying fact supporting the scientific conclusions.
Ontotext’s knowledge graph technology is well suited to addressing the disadvantages and maximizing the advantages of large, diverse datasets. Ontotext has years of experience transforming data into knowledge across a number of industries, especially in the healthcare sector. Adding formal semantics to data lets both machines and humans process it correctly and efficiently, through better search and analytics and through the automation of knowledge tasks. By using open standards such as RDF, data can be interconnected at the semantic level both internally and globally. It is from those connections that new discoveries are made.
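To make the idea of interconnection concrete, here is a minimal sketch in plain Python of how two datasets join up once they use the same identifiers. It models RDF-style statements as (subject, predicate, object) tuples rather than using a real RDF library, and every IRI, predicate and fact in it is hypothetical and chosen only for illustration.

```python
# Minimal sketch: RDF-style triples as Python tuples.
# All IRIs and facts below are hypothetical examples, not real data.
EX = "http://example.org/"

# An internal dataset: a proprietary compound and the gene it targets.
internal = {
    (EX + "compound/C1", EX + "targets", EX + "gene/G1"),
}

# A public dataset: the same gene, linked to a disease.
public = {
    (EX + "gene/G1", EX + "associatedWith", EX + "disease/D1"),
}


def merge(*graphs):
    """Union of triple sets; shared IRIs connect the graphs automatically."""
    merged = set()
    for g in graphs:
        merged |= g
    return merged


def two_hop(graph, start):
    """Return (middle, end) pairs reachable from `start` in exactly two hops."""
    results = set()
    for s1, _, o1 in graph:
        if s1 != start:
            continue
        for s2, _, o2 in graph:
            if s2 == o1:
                results.add((o1, o2))
    return results


graph = merge(internal, public)
links = two_hop(graph, EX + "compound/C1")
for middle, end in sorted(links):
    print(middle, "->", end)
```

Neither dataset alone connects the compound to the disease. The path appears only after the merge, because both sides refer to the gene by the same identifier.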
Ontotext has worked with healthcare clients to create tools to discover and evaluate new therapies for age-related macular degeneration and idiopathic pulmonary fibrosis. They also developed a large-scale knowledge graph for an early hypothesis testing tool.
Typical of healthcare use cases, this required pulling together concepts from genomics, proteomics, metabolomics, disease conditions, drug products, scientific literature and various biomedical ontologies, and integrating information from a variety of open datasets. In addition, Ontotext developed specialised natural language processing (NLP) services to semantically annotate corpora of scientific literature covering genes, diseases, compounds and drugs, and to identify the generic relationships between these concepts. Without automating these processes, organizations would never be able to process the volume of research needed to build a comprehensive picture of a disease’s mechanisms and to generate novel hypotheses for therapies and better health outcomes for patients.
This is where experience counts. Ontotext has a proven methodology for semantic data modeling that normalizes both data schemas and instances to concepts from the major ontologies and vocabularies used in the industry sector. They have tuned their NLP services to identify biomedical concepts and relationships in the unstructured text of scientific journals. These services extract new terms that can be fed back into the knowledge graph, further enriching the graph and improving system performance.
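The normalization step can be illustrated with a deliberately simple dictionary-based annotator. This is a sketch of the general idea only, not Ontotext's NLP services: the synonym table and the concept identifiers (`EX:0001`, `EX:0002`) are hypothetical, and a production system would draw them from established biomedical vocabularies and use far more robust matching.

```python
import re

# Hypothetical concept IDs; a real system would use identifiers
# from established ontologies and vocabularies.
SYNONYMS = {
    "age-related macular degeneration": "EX:0001",
    "amd": "EX:0001",
    "idiopathic pulmonary fibrosis": "EX:0002",
    "ipf": "EX:0002",
}


def annotate(text):
    """Return sorted (start, end, surface_form, concept_id) tuples
    for every dictionary label found in `text` (case-insensitive)."""
    found = []
    for label, concept in SYNONYMS.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            found.append((m.start(), m.end(), m.group(0), concept))
    return sorted(found)


sentence = "Patients with AMD or idiopathic pulmonary fibrosis were included."
annotations = annotate(sentence)
concepts = {concept for _, _, _, concept in annotations}

for start, end, surface, concept in annotations:
    print(f"{surface!r} at {start}-{end} -> {concept}")
```

Because the abbreviation and the full name map to the same identifier, mentions written differently in different papers end up attached to a single node in the graph.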
The knowledge graph seamlessly connects proprietary internal data with open public data to provide a single comprehensive view. The inference capabilities of Ontotext’s GraphDB derive new knowledge from existing facts, which adds explanatory power to knowledge discovery. Applying logical rules to the data at scale turns the disjointed pieces of information coming from different sources into a richer network of knowledge. This allows researchers to follow the provenance of promising correlations and facts, and to gather the insights they need from datasets that would otherwise be too large and diverse to search.
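As a rough illustration of rule-based inference with traceable provenance, the sketch below forward-chains two simple rules (transitive `subClassOf` and type inheritance) until no new facts appear, and records which premises produced each derived fact. It is plain Python written for this example and does not represent GraphDB's rule engine or API; the facts and the source labels `internal_db` and `public_ontology` are hypothetical.

```python
def infer(facts):
    """Forward-chain two rules to a fixed point.

    `facts` maps each asserted triple to a source label (a string).
    Each derived triple maps to the pair of premises it came from.
    """
    known = dict(facts)
    changed = True
    while changed:
        changed = False
        triples = list(known)  # snapshot, since `known` grows in the loop
        for a in triples:
            for b in triples:
                new = None
                # Rule 1: (x subClassOf y), (y subClassOf z) => (x subClassOf z)
                if a[1] == "subClassOf" and b[1] == "subClassOf" and a[2] == b[0]:
                    new = (a[0], "subClassOf", b[2])
                # Rule 2: (i type x), (x subClassOf y) => (i type y)
                elif a[1] == "type" and b[1] == "subClassOf" and a[2] == b[0]:
                    new = (a[0], "type", b[2])
                if new and new not in known:
                    known[new] = (a, b)
                    changed = True
    return known


def sources(triple, known):
    """Trace a triple back to the set of original source labels."""
    origin = known[triple]
    if isinstance(origin, str):
        return {origin}
    result = set()
    for premise in origin:
        result |= sources(premise, known)
    return result


# Hypothetical facts from two hypothetical sources.
asserted = {
    ("drugX", "type", "KinaseInhibitor"): "internal_db",
    ("KinaseInhibitor", "subClassOf", "EnzymeInhibitor"): "public_ontology",
    ("EnzymeInhibitor", "subClassOf", "Drug"): "public_ontology",
}

closed = infer(asserted)
derived = ("drugX", "type", "Drug")
print(derived in closed, sorted(sources(derived, closed)))
```

The derived statement that `drugX` is a `Drug` was never asserted by either source, yet it can be traced back to the internal and public facts that jointly support it.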
In the healthcare sector, our clients gain:
The volume and diversity of data are a challenge for companies and organizations across sectors. Ontotext’s years of working with the data-intensive healthcare sector have provided real-world experience that can be applied widely.