AIDAVA: AI-powered Data Curation & Publishing Virtual Assistant

  • Active
  • Programme: Horizon Europe
  • Call: HORIZON-HLTH-2021-TOOL-06
  • Start date: 01.09.2022
  • End date: 31.08.2026

AIDAVA (AI-powered Data Curation & Publishing Virtual Assistant) is a project funded by the European Union’s Horizon Europe Framework Programme for Research and Innovation, call “Tools and technologies for a healthy society (2021)” (HORIZON-HLTH-2021-TOOL-06).

AIDAVA aims to unlock the wealth of knowledge siloed inside healthcare systems by enabling interoperability, AI-readiness and reuse-readiness of heterogeneous personal health data (PHD) at institutional, national and EU scale.

Project website:
CORDIS website:

Contact: Svetla Boytcheva

As the age of big data brings an abundance of information, the (semi-manual) curation of useful data grows ever more cumbersome for data stewards. Curation is further hindered by the siloing of data in separate systems and formats, which limits the context within which any particular data can be examined.

AIDAVA addresses both problems by adopting a universal data representation based on ontology standards, by increasing the degree of automation of data quality enhancement and FAIRification, and by employing state-of-the-art AI models both for information extraction and for comprehensible explanations of the whole process to the end user.

The result is an AI-powered virtual assistant, available in three European languages (German, Dutch and Estonian), that automates the workload of data stewards around breast cancer patient registries and longitudinal records of cardiovascular patients. In the long term, use of this virtual assistant can be expanded to patients and non-experts, leading to democratized participation in data curation, lower overall healthcare costs and support for the European Health Data Space.

Ontotext’s Role

Over the last two decades, Ontotext has developed advanced tools for semantic data normalization and Natural Language Processing (NLP) for Life Science and Healthcare projects in both commercial and research settings. Within AIDAVA, Ontotext leads the definition of a global data standard, develops novel deep learning (DL) and NLP tools for text-based content, and develops novel machine learning (ML) and knowledge graph (KG) based tools for harmonization and quality enhancement of structured data. Ontotext is also responsible for the communication, dissemination and exploitation of the project.

We employ language-independent approaches to extract relevant entities and their relations from text, obtaining personal knowledge graphs. These are then harmonized and standardized into a unified knowledge graph through entity linking, disambiguation and the use of international classification and ontology standards.
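To make the harmonization step described above more concrete, here is a minimal Python sketch of dictionary-based entity linking over toy triples. It is only an illustration under stated assumptions, not the tooling used in AIDAVA: the `Triple` type, the `CONCEPT_INDEX` lookup table, the `hasDiagnosis` predicate and the `EX:` identifiers are all hypothetical placeholders, and a real pipeline would link against actual international classifications and handle disambiguation.

```python
from typing import Iterable, NamedTuple, Set


class Triple(NamedTuple):
    """One (subject, predicate, object) statement in a knowledge graph."""
    subject: str
    predicate: str
    obj: str


# Hypothetical lookup table: lowercased surface form -> canonical concept ID.
# The "EX:" identifiers are placeholders, not codes from a real classification.
CONCEPT_INDEX = {
    "breast cancer": "EX:0001",
    "mammakarzinom": "EX:0001",  # German
    "borstkanker": "EX:0001",    # Dutch
    "rinnavähk": "EX:0001",      # Estonian
}


def link_entity(mention: str) -> str:
    """Return the canonical ID for a mention, or the normalized mention if unknown."""
    key = mention.strip().lower()
    return CONCEPT_INDEX.get(key, key)


def harmonize(personal_graphs: Iterable[Iterable[Triple]]) -> Set[Triple]:
    """Merge personal graphs into one deduplicated set with linked objects."""
    unified: Set[Triple] = set()
    for graph in personal_graphs:
        for t in graph:
            unified.add(Triple(t.subject, t.predicate, link_entity(t.obj)))
    return unified


# Two toy personal knowledge graphs, as if extracted from German and Dutch text.
graph_de = [Triple("patient:1", "hasDiagnosis", "Mammakarzinom")]
graph_nl = [
    Triple("patient:2", "hasDiagnosis", "borstkanker"),
    Triple("patient:1", "hasDiagnosis", "Breast cancer"),
]

unified = harmonize([graph_de, graph_nl])
```

In this sketch, mentions in different languages collapse onto one concept identifier, so the two statements about `patient:1` become a single triple in the unified graph.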

Further, we develop tools for measurement of FAIR and data quality metrics, use them to improve the quality of the data and publish the resulting enhanced data in a FAIR registry.

Finally, Ontotext is also responsible for AIDAVA’s outreach to the public, scientific and business communities. To this end, we maintain AIDAVA’s online and public presence and coordinate the consortium’s outreach activities and participation in conferences, workshops and other networking and knowledge-sharing events.

This project has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement No 101057062. Views and opinions expressed are however those of the author only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.
