EU-funded Project AI4EU-CODE Was Selected As a Success Story

Results achieved in AI4EU-CODE picked among 40 solutions as one of the top 4 success stories of the AI4EU project

Sofia, Bulgaria, Friday, January 28, 2022

Ontotext is pleased to announce the successful completion of the EU-funded project AI4EU-CODE – Classify Oncology Diseases: Español, which ran from 23 June 2021 through 16 December 2021.

The project addressed the AI4EU challenge “Identification of Colon Cancer Risk Factors”. Ontotext’s solution to the challenge (briefly presented in this success story video) transforms unstructured clinical text (discharge letters) into structured knowledge, which can improve the quality of healthcare. The solution was also presented as a success story at several AI4EU events (SLUSH, the AI4EU Stakeholders meeting, etc.).

AI4EU-CODE’s main objective was to deliver a method for automatically assigning codes from the International Classification of Diseases, 10th Revision (ICD-10) to clinical text in Spanish. The project’s text-based classification methodology incorporates state-of-the-art deep learning language models, adapted and fine-tuned for the specific domain and language. Two services have been developed: a generic one, which automatically assigns ICD-10 codes for a broad range of diagnoses, and a specific one, which focuses on the Colorectal Cancer (CRC) use case and predicts ICD-10 codes for CRC and its associated diagnoses.

The results exceed the state of the art, according to an accuracy comparison with other available solutions on the CodiEsp benchmark dataset.

Moreover, because of the deep learning language models selected as building blocks for the services, the solution works efficiently for both Spanish and English, as well as for some other EU languages.

Medical coding is an important task in healthcare and is currently done mostly by hand. It is used for healthcare administrative information, in electronic health records, for health insurance and for medical research.

The NLP services developed in the project can serve as a base component for various business solutions and can be customized to specific customer needs. Customization can go in several directions:

  • Domain focus: depending on the customer’s needs, we can develop either general-purpose pipelines (wide class coverage) or pipelines that target specific narrow domains. Within the project, the focus domain was colorectal cancer;
  • Terminology for normalization: some use cases require a specific terminology for data normalization, e.g., a recent request to use the Australian ICD-10 for diagnostic medical coding of discharge letters;
  • Language: depending on the use case, different language coverage might be needed, for example Italian and Dutch (another research project); English (a commercial project request for Australian medical coding); or all EU languages (a request from a top-10 pharma company for processing drug product brochures).

Along with the delivered solution, Ontotext has developed a methodology for training and adapting NLP pipelines to meet the specific needs of particular medical domains, terminologies and languages.

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under Sub-grant Agreement AI4EU Open Call for Solutions No 825619.


For more information, contact Doug Kimball, Chief Marketing Officer at Ontotext