AstraZeneca: Enabling Early Hypotheses Testing Through Linked Data

As part of the LarKC project, AstraZeneca and Ontotext collaborated to build a large knowledge graph that supports a holistic view of translational medicine. The resulting graph integrates various categories of information into a unified, explorable network of knowledge. The built-in causal relations ontology allows exploration of distant (indirect) relations between objects that are not obvious in a single data source, providing a foundation for early hypothesis testing.

  • Reduced time and effort in identifying relevant relationships
  • Increased efficiency by easily spotting and separating relevant relationships & resolving uncertainties
  • Improved user experience with an easy and intuitive tool for mining and exploration 

The Goal

AstraZeneca needed to develop a platform for Interactive Relationship Discovery to enable the identification of long causal relationship chains between the biomedical objects in the Linked Life Data cloud. The industry-specific platform was to be used for early hypothesis testing, which requires identifying direct and indirect relationships between biomedical entities and suggesting possible mechanisms that usually remain hidden.

To facilitate the process of relationship discovery, the platform needed to provide an easy and intuitive tool that would allow the researchers to interactively mine and explore causal relations.

The Challenge

In the pharmaceutical research and discovery process, success is highly dependent on the availability and accessibility of high-quality research data. The quality of the data can be assessed by its accuracy, correctness, completeness, currency, and relevance. The accuracy and the correctness of data are defined by the methods used to generate the data. However, the latter three – completeness, currency, and relevance – could be determined partially or completely by an effective semantic data integration approach, which:

  • aggregates all relevant information
  • removes redundancy and ambiguities in the data
  • interlinks the related entities

Researchers gather information from a broad range of biomedical data sources in an iterative way in order to generate or expand a certain theory, to test hypotheses and make informed assertions about which relationships are causal and exactly how they are causal. They need a mechanism that allows them to mine all the data scattered among different resources and to identify visible (direct) and invisible (distant) relations between biomedical entities studied in the pharmaceutical research and discovery process.
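To make the idea of direct versus distant relations concrete, here is a minimal Python sketch of chain discovery over a set of subject-relation-object triples. It is an illustration only, not the Linked Life Data implementation: the entity names, relation labels, and the `find_chains` helper are all hypothetical placeholders.

```python
from collections import deque

# Hypothetical toy triples (subject, relation, object); not real biomedical data.
TRIPLES = [
    ("drug_A", "inhibits", "protein_B"),
    ("protein_B", "participates_in", "pathway_C"),
    ("pathway_C", "associated_with", "disease_D"),
    ("drug_A", "causes", "side_effect_E"),
]


def build_index(triples):
    """Map each subject to its outgoing (relation, object) pairs."""
    index = {}
    for subject, relation, obj in triples:
        index.setdefault(subject, []).append((relation, obj))
    return index


def find_chains(triples, start, goal, max_hops=4):
    """Breadth-first search for relation chains from start to goal.

    Returns a list of chains, each chain being a list of triples.
    Chains never revisit an entity and are at most max_hops long.
    """
    index = build_index(triples)
    chains = []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if node == goal and path:
            chains.append(path)
            continue
        if len(path) >= max_hops:
            continue
        # Entities already on this chain, used to avoid cycles.
        visited = {start} | {obj for _, _, obj in path}
        for relation, nxt in index.get(node, []):
            if nxt not in visited:
                queue.append((nxt, path + [(node, relation, nxt)]))
    return chains


if __name__ == "__main__":
    for chain in find_chains(TRIPLES, "drug_A", "disease_D"):
        print(" -> ".join(f"{s} [{r}] {o}" for s, r, o in chain))
```

In this toy data there is no direct triple linking `drug_A` to `disease_D`; the connection only appears as a three-step chain, which is the kind of distant relation described above. The `max_hops` limit is an assumed design choice to keep the search bounded in a densely interlinked graph.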

The Solution: Linked Life Data Cloud

Semantic warehousing helps researchers get an overview of the existing relationships within scientific and clinical data by utilizing causality data mining. Linked Life Data is used as a platform for Interactive Relationship Discovery between biomedical entities as it:

  • integrates over 25 diverse data sources
  • aligns the data to more than 17 different types of biomedical objects (genes, proteins, molecular functions, biological processes/pathways, molecular interactions, cell localization, organisms, organs/tissues, cell lines, cell types, diseases, symptoms, drugs, drug side effects, small chemical compounds, clinical trials, scientific publications, and more)
  • identifies explicit relationships between entities locked in the original data sets and categorizes them to a causality relationship ontology
  • mines unstructured data in order to identify relationships hidden within the text (inclusion/exclusion criteria for clinical studies)


Since the entities in Linked Life Data are usually strongly interlinked, simply crawling or querying the repository for relationships and listing them is generally not sufficient. That’s why Linked Life Data also provides a user-centered process and interactive tools to assist the discovery of causal relations, even in very large numbers.

Business Benefits

  • Efficiently gain an overview of the identified causal relationships between biomedical objects
  • Interactively explore these relationships
  • Easily spot and separate relationships that are relevant in a certain use case

Why Choose Ontotext?

With Ontotext’s Linked Life Data Inventory, researchers at AstraZeneca can increase their efficiency and reduce the time and resources spent on exploring relationships. As a result, the biopharmaceutical company can now quickly resolve uncertainties about the early development of drugs with the help of data-driven hypothesis testing.
