Highlights from the “Mining Electronic Health Records for Insights” Webinar

October 29, 2015 4 mins. read Milena Yankova

On October 15, 2015, Todor Primov, a Healthcare expert with Ontotext, presented Mining Electronic Health Records for Insights: Beyond Ontology-Based Text Mining. This webinar highlighted some of the challenges in text mining clinical patient data and the solutions that Ontotext provides to overcome them, including:

    • Ontology-based Information Extraction
    • Application of flexible gazetteers
    • Negation detection
    • Temporality identification
    • Discovery of post-coordination patterns
    • Generation of Linked Data

The presentation also addressed many of the issues raised in our earlier blog post Overcoming the Next Hurdle in the Digital Healthcare Revolution: EHR Semantic Interoperability.



Q & A from the webinar

During the webinar, Todor covered some of the challenges in applying Natural Language Processing over clinical patient data and some of Ontotext’s solutions.

Some really interesting questions were raised by the audience. Here is our selection:

Q: Pre-coordinated vs. post-coordinated vocabularies. Why are pre-coordinated vocabularies still used? Are there any advantages of pre-coordinated compared to post-coordinated vocabularies?

A: There are many pre-coordinated ontologies, used primarily for medical coding, such as ICD-9-CM, ICD-10-CM and ICPC. In many use cases, a particular medical observation must be identified and referred to unambiguously. That calls for a fully qualified concept, and pre-coordinated ontologies are a good reference source. With post-coordinated ontologies, by contrast, we can model complex medical findings using relations between the “seed concept” and additional qualifiers or other classes of instances.

However, the post-coordination approach means that a finding is referenced not by a single concept but by a relation between concepts. Some ontologies, such as SNOMED CT, benefit from both approaches. Which approach to apply is always a trade-off, usually determined by the particular use case.
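To make the trade-off concrete, here is a minimal sketch in Python. The codes `X001` and `X002`, the relation names and the qualifier values are all invented for illustration and do not come from ICD, ICPC or SNOMED CT. A pre-coordinated list offers one code per fully specified finding, while a post-coordinated expression combines a seed concept with qualifiers and so can describe findings that no single code covers.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PostCoordinated:
    """A finding expressed as a seed concept plus (relation, qualifier) pairs."""
    seed: str
    qualifiers: frozenset = field(default_factory=frozenset)


# Hypothetical pre-coordinated code list: one code per fully specified finding.
PRE_COORDINATED = {
    "X001": PostCoordinated("fracture", frozenset({("site", "femur"), ("laterality", "left")})),
    "X002": PostCoordinated("fracture", frozenset({("site", "femur"), ("laterality", "right")})),
}


def to_pre_coordinated(expr):
    """Return the single code whose definition equals the expression, or None."""
    for code, definition in PRE_COORDINATED.items():
        if definition == expr:
            return code
    return None


left_femur = PostCoordinated(
    "fracture", frozenset({("site", "femur"), ("laterality", "left")})
)
open_left_femur = PostCoordinated(
    "fracture",
    frozenset({("site", "femur"), ("laterality", "left"), ("type", "open")}),
)

print(to_pre_coordinated(left_femur))       # X001: one unambiguous code
print(to_pre_coordinated(open_left_femur))  # None: only expressible as relations
```

The second expression has no single code in the toy list, which is the situation where a finding has to be referenced through relations between concepts.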

Q: How can we stop the explosion of possible mappings using flexible gazetteers? How many mappings are acceptable until they lose meaning for practitioners or domain experts?

A: To enrich our dictionaries, we use a predefined sequence of routines. Each routine performs a specific task, and the routines follow a fixed order, starting with ignore rules, rewrite rules and synonym/term-inversion enrichment. The output of one routine serves as the input for the next step in the workflow. Each routine contains multiple rules that are applied just once, so the routines in the workflow are not applied iteratively and there is no risk of “explosion”.

However, even when each set of rules is applied just once, the number of literals increases significantly compared to the initial set. It is always good practice to check the newly generated terms against a large corpus of domain-specific documents (such as medical journal articles or anonymized EHRs) to confirm that medical professionals actually use them. The generated dictionary is used by both standard and so-called flexible gazetteers. Flexible gazetteers can identify a term from the dictionary even if its tokens are separated by an additional token in the real text.
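The workflow described in this answer can be sketched roughly as follows. This is an illustration under stated assumptions, not Ontotext's implementation: the three rules, the function names, the order of the routines and the `max_gap` parameter are all hypothetical. The sketch shows a single pass through a fixed sequence of routines, a corpus check on the generated terms, and a gazetteer match that tolerates extra tokens between the tokens of a term.

```python
import re


def tokenize(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"\w+", text.lower())


def apply_ignore_rules(terms):
    """Illustrative ignore rule: drop a trailing ', NOS' or ', unspecified'."""
    return set(terms) | {re.sub(r",\s*(NOS|unspecified)$", "", t) for t in terms}


def apply_rewrite_rules(terms):
    """Illustrative rewrite rule: add the US spelling of 'tumour'."""
    return set(terms) | {t.replace("tumour", "tumor") for t in terms}


def apply_term_inversion(terms):
    """Illustrative inversion rule: 'failure, renal' -> 'renal failure'."""
    out = set(terms)
    for t in terms:
        parts = [p.strip() for p in t.split(",")]
        if len(parts) == 2 and all(parts):
            out.add(f"{parts[1]} {parts[0]}")
    return out


# Assumed order; each routine runs exactly once, so growth stays bounded.
ROUTINES = [apply_ignore_rules, apply_rewrite_rules, apply_term_inversion]


def enrich(terms):
    current = set(terms)
    for routine in ROUTINES:
        current = routine(current)  # output of one step feeds the next
    return current


def validate_against_corpus(original, enriched, corpus):
    """Keep original terms, plus generated terms that occur in the corpus."""
    text = " ".join(corpus).lower()
    return {t for t in enriched if t in original or t.lower() in text}


def flexible_match(term, text, max_gap=1):
    """True if the term's tokens occur in order with at most `max_gap`
    extra tokens between consecutive term tokens (max_gap=0 is a plain match)."""
    term_tokens = tokenize(term)
    text_tokens = tokenize(text)
    if not term_tokens:
        return False
    for start, tok in enumerate(text_tokens):
        if tok != term_tokens[0]:
            continue
        pos, matched = start, True
        for want in term_tokens[1:]:
            window = text_tokens[pos + 1 : pos + 2 + max_gap]
            if want not in window:
                matched = False
                break
            pos = pos + 1 + window.index(want)
        if matched:
            return True
    return False


original = {"failure, renal, NOS"}
enriched = enrich(original)
print(sorted(enriched))
print(flexible_match("heart failure", "heart (congestive) failure", max_gap=1))
```

Because every routine is applied once and in sequence, the number of generated terms grows with the number of rules rather than compounding across repeated passes.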

Q: Are you able to normalize all of the qualifiers to concepts from an ontology?

A: When we use post-coordination patterns to identify and fully specify a concept in the text, we use qualifiers that are already defined by an ontology. However, in many cases we identify a qualifier in the noun phrase but cannot normalize it to a valid concept from an ontology. This requires modeling the extracted data in RDF in a way that also stores the text or tokens that could not be grounded to an ontology concept. It also requires new approaches for exploring the data extracted from text.
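One possible way to keep ungrounded qualifiers is sketched below with plain Python tuples standing in for RDF triples. The `http://example.org/ehr/` namespace and the predicate names `seedConcept`, `qualifier` and `ungroundedQualifierText` are hypothetical and chosen only for this illustration; the webinar does not specify the actual schema.

```python
# Hypothetical namespace and predicate names, used only for this illustration.
EX = "http://example.org/ehr/"


def finding_triples(finding_iri, seed_iri, qualifiers):
    """Build (subject, predicate, object) triples for one extracted finding.

    `qualifiers` is a list of (surface_text, concept_iri_or_None) pairs.
    """
    triples = [(finding_iri, EX + "seedConcept", seed_iri)]
    for surface_text, concept_iri in qualifiers:
        if concept_iri is not None:
            # Grounded qualifier: link to the ontology concept.
            triples.append((finding_iri, EX + "qualifier", concept_iri))
        else:
            # Ungrounded qualifier: keep the raw text as a quoted literal
            # so the information is not lost.
            triples.append(
                (finding_iri, EX + "ungroundedQualifierText", '"' + surface_text + '"')
            )
    return triples


triples = finding_triples(
    EX + "finding/1",
    EX + "concept/pain",
    [("severe", EX + "concept/severe"), ("gnawing", None)],
)
for s, p, o in triples:
    print(s, p, o)
```

Keeping the raw text under its own predicate lets later queries separate normalized qualifiers from free-text ones, which is one reason the extracted data needs different exploration approaches.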

Q: How do you model relations between extracted entities?

A: If the extraction rules cover different concept classes and the relation between them, we model the semantics of the relation with special predicates. This is the case when we extract drug dosage information: we identify a drug concept, a disease concept and the relation that the disease is an indication for the drug, and we model that relation as drug “hasIndication” disease. Other, more trivial relations in the knowledge base are modeled using the SKOS schema (related, closeMatch or exactMatch), depending on the type of relation and the mechanism used to define the mapping.
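As a rough sketch of the two kinds of relations mentioned here, the snippet below again uses tuples for triples. The SKOS namespace is the standard one; the `http://example.org/ehr/` namespace, the example IRIs and the rule that picks a SKOS predicate from the mapping mechanism are assumptions made for illustration, not the rules used in the webinar.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/ehr/"  # hypothetical namespace for domain predicates


def indication_triple(drug_iri, disease_iri):
    """Domain-specific relation: the disease is an indication for the drug."""
    return (drug_iri, EX + "hasIndication", disease_iri)


def mapping_predicate(mechanism):
    """Pick a SKOS predicate from how the mapping was produced.

    Assumed rule of thumb: shared identifiers give exactMatch, label
    matches give closeMatch, and anything else falls back to related.
    """
    table = {
        "identifier": SKOS + "exactMatch",
        "label": SKOS + "closeMatch",
    }
    return table.get(mechanism, SKOS + "related")


def mapping_triple(source_iri, target_iri, mechanism):
    return (source_iri, mapping_predicate(mechanism), target_iri)


triples = [
    indication_triple(EX + "drug/metformin", EX + "disease/type2-diabetes"),
    mapping_triple(EX + "disease/type2-diabetes", EX + "other/T2DM", "label"),
]
for s, p, o in triples:
    print(s, p, o)
```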


A bright lady with a PhD in Computer Science, Milena's path started in the role of a developer, passed through project and quickly led her to product management. For her a constant source of miracles is how technology supports and alters our behaviour, engagement and social connections.
