This post presents COVID-19-related projects currently using GraphDB:
We will maintain this list to increase the visibility of those projects and help the scientific community use their results and collaborate.
Ontotext’s policy is to donate GraphDB Enterprise Edition licences, along with support and maintenance services, to such projects.
In only a few months, the COVID-19 pandemic has swept across the world, spreading to more than 200 countries and territories. The coronavirus outbreak has already claimed thousands of lives, grounded flights, canceled sports events and concerts, and plunged economies into recession as countries went into lockdown.
As all this unfolds, the scientific community is racing against time to respond to the pandemic by developing diagnostic tests, therapies and vaccines and by conducting pre-clinical and clinical research. Organizations across many countries are joining forces to face the pandemic and its global consequences.
“There is no greater case for collective action than our joint response to COVID-19 – we are in this together and we will get through this together,” says António Guterres, Secretary-General of the United Nations.
One of the big challenges for the scientific community in the current situation is the vast volume of data that is constantly produced from various sources in several domains.
Making sense of messy data from disparate sources is what Ontotext does best, thanks to its long experience with many clients and projects. Ontotext is therefore making its small but powerful contribution by supporting global COVID-19-related initiatives with its technology. Several research projects are already using GraphDB – Ontotext’s leading RDF database for creating knowledge graphs.
Knowledge graphs are collections of live, richly interconnected, machine-processable knowledge that use formal semantics and automated reasoning to enable deep analytics. Their ability to derive new knowledge from existing facts and to uncover hidden relationships makes them well suited to analyzing rapidly changing data from disparate sources.
The FHIRCat group at the Mayo Clinic has published the CORD-19-on-FHIR dataset for COVID-19 research. CORD-19-on-FHIR aims to enable the semantics of FHIR and terminologies for clinical and translational research.
The FHIRCat group started using GraphDB, and Ontotext offered assistance with setting up GraphDB Enterprise Edition and tuning query performance on the SPARQL endpoint of the public service. The initial dataset consisted of 13,202 journal articles relevant to novel coronavirus research. It was represented in FHIR RDF to facilitate semantic linkage with other biomedical datasets and was extended by adding the following semantic annotations:
The FHIR RDF version of CORD-19 plans to use the PICO ontology for modeling the annotations and to store them back in GraphDB.
The CORD-19-on-FHIR dataset, licensed to encourage open COVID-19 research, is available on GitHub, and further collaboration is encouraged.
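To give a feel for what querying a SPARQL endpoint like the one mentioned above involves, here is a minimal Python sketch that builds a SPARQL 1.1 Protocol request without sending it. The host, port and repository name (`cord19`) are assumptions made for illustration and are not the address of the FHIRCat public service; the query is a generic one that lists a few triples rather than anything specific to the FHIR RDF model.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: a local GraphDB instance with a repository named "cord19".
# Replace it with the address of the repository you actually want to query.
ENDPOINT = "http://localhost:7200/repositories/cord19"

# Generic query that returns ten arbitrary triples from the repository.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
```

Passing `request` to `urllib.request.urlopen` would send the query; the sketch stops short of that so it runs without a live server.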
The Spatio-Temporal Knowledge Observatory (STKO) Lab in the Geography Department of the University of California, Santa Barbara (UCSB) has started integrating information about COVID-19-related disruptions to air traffic and supply chains into an open research knowledge graph.
The linked dataset aims to give researchers a better comparative overview of the current situation and is constantly updated. The following data is published and available for exploration and querying in GraphDB:
GraphDB’s Visual Graph can be used to explore the data as demonstrated below.
As this type of data is very dynamic, the flexibility of knowledge graphs and their capacity to seamlessly integrate data from disparate sources provide researchers with valuable live insights into the COVID-19 pandemic and its consequences.
Krzysztof Janowicz, director of STKO, emphasized how important it is to have this data properly aligned to geographic regions. As most of this data is relevant to specific regions, it is very important to be able to traverse sub-region relationships in order to aggregate information, discover correlations and perform other types of analysis. An example of the sort of linked data reasoning that can be employed here is that if quarantine and social distancing measures are in place for a region, then a community that’s part of this region will be subject to those same restrictions, so you don’t need to materialize everything in the graph.
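The region-based reasoning described above can be illustrated with a small Python sketch. It is not the STKO implementation and does not use GraphDB's reasoner; it only shows the idea that measures are stored once, on the region where they were declared, and are derived for sub-regions by walking the part-of hierarchy. The part-of facts and the measures attached to each region are made-up sample data.

```python
# Sample part-of facts (child region -> enclosing region), for illustration only.
PART_OF = {
    "Santa Barbara": "Santa Barbara County",
    "Santa Barbara County": "California",
    "California": "United States",
}

# Made-up measures, each asserted only on the region where it was declared.
MEASURES = {
    "California": {"social distancing"},
    "Santa Barbara County": {"quarantine"},
}


def ancestors(region, part_of):
    """Yield every region that transitively contains `region`."""
    seen = set()
    current = part_of.get(region)
    while current is not None and current not in seen:  # guard against cycles
        seen.add(current)
        yield current
        current = part_of.get(current)


def effective_measures(region, part_of, measures):
    """Return measures asserted on the region plus those inherited from enclosing regions."""
    result = set(measures.get(region, set()))
    for parent in ancestors(region, part_of):
        result |= measures.get(parent, set())
    return result


city_measures = effective_measures("Santa Barbara", PART_OF, MEASURES)
```

Nothing is stored for "Santa Barbara" itself, yet `city_measures` contains both measures because they are inherited from the county and the state. In a triple store, the same effect is typically achieved with a transitive part-of relation that is evaluated by the reasoner or at query time.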
This project is also featured in the list of projects using knowledge graphs in the fight against COVID-19, which were presented at the meetup “Knowledge Graphs to Fight COVID-19“.
Cochrane, an international NGO for organizing medical research findings, is developing the Cochrane COVID-19 Study Register – an application for collating and navigating COVID-19 living evidence. Ontotext’s knowledge graph technology is at the core of Cochrane’s data architecture developed by our partners from Data Language.
In his blog post “How knowledge graph technology is helping Cochrane respond to COVID-19”, Paul Wilton presents in great detail the data modelling principles and the software architecture behind the register. Here, GraphDB is used for storing the ontology models, the vocabulary, the content metadata and the graphs from the PICO ontology. For the integration and curation of the linked data vocabulary and the PICO graphs, Cochrane uses a combination of GraphDB and Elasticsearch. Using SPARQL queries, researchers can find studies, reviews and meta-analyses with similar fingerprints or other patterns in the data.
GraphDB is also used by some of the participants in the COVID-19 Open Research Dataset Challenge (CORD-19) organized by Kaggle, the largest online community for data science and machine learning. The Challenge is an appeal to AI professionals to develop text and data mining tools that can help the medical community answer high-priority scientific questions.
The provided dataset CORD-19 is a full-text and metadata dataset of COVID-19 and coronavirus-related research articles optimized for machine readability. It contains more than 51,000 scholarly articles and is available to the global research community.
Above, we presented some of the research projects currently using GraphDB to manage their data and content for rich analytics. We will update this list as new projects come in.
We at Ontotext firmly believe that all these initiatives will strengthen collaboration and will facilitate the research community in finding solutions to the COVID-19 global threat.
If you think knowledge graphs and GraphDB can help you in your COVID-19 related research, don’t hesitate to contact us!