How Pharma Companies Can Scale Up Their Knowledge Discovery with Semantic Similarity Search 

The Key to Efficiently Processing Large Volumes of Regulatory Authorities’ Questions and Quickly Sending the Answers

March 6, 2020 4 mins. read Milen Yankulov

Pharma has deep roots in human history, with centuries of folk pharmaceutical knowledge offering a hit-and-miss range of natural remedies. But the industry as we know it today actually emerged in the second half of the 19th century, when the world’s first factory for the sole production of medicines was founded.

By the late 19th and early 20th century, some chemical companies had already begun using research labs to explore the medical applications for their products. Fast forward to today and the pharmaceutical sector is a global trillion-dollar industry. However, to ensure the safety and efficacy of drugs, the process of drug discovery and development is under extensive scrutiny and control on both national and global levels.

Regulatory Landscape Challenges for Pharmaceuticals

On top of having to comply with stringent and detailed regulatory requirements, Pharma companies also have to respond to a lot of questions from different regulatory agencies and do it within a short period of time. Typically, such companies have archives with a significant number of already answered questions. But the huge amount of data gathered over the years is in various formats (mostly unstructured) and stored on various systems, which makes the retrieval of the information costly, time-consuming and inefficient.

Essentially, whether a regulatory authority views a company as trustworthy and competent depends largely on its responses to compliance questions. An incomplete or late response can create a negative impression, hurt the company’s finances and even lead to regulatory action. Pharmaceutical companies therefore need a system that goes beyond conventional search technologies, which are increasingly failing to address their needs.

Scaling Up Information Extraction for Greater Value and Less Response Time

To enable pharmaceutical companies to quickly process large volumes of questions from regulatory agencies, Ontotext has developed a smart solution, which makes information extraction easier, faster and much more efficient.


First of all, this solution can ingest large volumes of documents in various formats and automatically extract and classify pairs of questions and answers. Then, the content of the questions is semantically indexed, which enables the system to compare new questions to any previous questions stored in the database.
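To make the extraction step more concrete, here is a minimal sketch of pulling question and answer pairs out of plain text. It is not Ontotext's implementation: the "Q:" and "A:" markers, the `QA_PATTERN` regular expression and the `extract_qa_pairs` function are all assumptions made for illustration, and real archives would need format-specific parsing and classification on top.

```python
import re
from typing import List, Tuple

# Hypothetical convention: each question starts with "Q:" and each answer with "A:".
# Real archives are far messier; this only illustrates the pairing step.
QA_PATTERN = re.compile(
    r"Q:\s*(?P<question>.+?)\s*A:\s*(?P<answer>.+?)(?=\s*Q:|\Z)",
    re.DOTALL,
)


def extract_qa_pairs(text: str) -> List[Tuple[str, str]]:
    """Return (question, answer) pairs found in a block of plain text."""
    pairs = []
    for match in QA_PATTERN.finditer(text):
        # Collapse line breaks and repeated spaces inside each field.
        question = " ".join(match.group("question").split())
        answer = " ".join(match.group("answer").split())
        pairs.append((question, answer))
    return pairs


sample = """
Q: What is the shelf life of the product?
A: 24 months when stored below 25 C.
Q: Which stability studies support this?
A: Long-term and accelerated studies,
   summarised in the stability report.
"""

pairs = extract_qa_pairs(sample)
for question, answer in pairs:
    print(question, "->", answer)
```

Once pairs are isolated like this, each question can be indexed on its own, so a new question is compared against past questions rather than against whole documents.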

A knowledge graph (KG) is then created from this processed data. It represents the relationships between the different elements of the documents and powers semantic search. This type of search goes beyond traditional keyword matching: it is more intuitive and context-aware, which allows it to disambiguate concepts. Because the graph is highly interlinked, the search can also recognize multiple references to one and the same entity.
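The idea of recognizing several references to the same entity can be sketched with a toy graph of subject, predicate, object triples. Everything here is an assumption for illustration: the `ALIASES` table, the `TinyGraph` class and the identifiers such as `drug:aspirin` are hypothetical, and a production system would store the data in a graph database rather than in Python sets.

```python
from collections import defaultdict

# Hypothetical alias table: different surface forms that denote the same entity.
ALIASES = {
    "acetylsalicylic acid": "drug:aspirin",
    "aspirin": "drug:aspirin",
    "asa": "drug:aspirin",
    "ema": "agency:ema",
    "european medicines agency": "agency:ema",
}


def resolve(mention):
    """Map a textual mention to a canonical entity ID, or None if unknown."""
    return ALIASES.get(mention.strip().lower())


class TinyGraph:
    """A toy triple store with a lookup index on (predicate, object)."""

    def __init__(self):
        self.triples = set()
        self.by_object = defaultdict(set)

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))
        self.by_object[(predicate, obj)].add(subject)

    def subjects(self, predicate, obj):
        return sorted(self.by_object[(predicate, obj)])


graph = TinyGraph()
graph.add("question:1", "mentions", resolve("Aspirin"))
graph.add("question:2", "mentions", resolve("acetylsalicylic acid"))
graph.add("question:2", "askedBy", resolve("EMA"))
graph.add("question:3", "askedBy", resolve("European Medicines Agency"))

# A keyword search for "aspirin" would miss question:2; the graph lookup finds both.
about_aspirin = graph.subjects("mentions", "drug:aspirin")
print(about_aspirin)
```

Because both mentions resolve to one node, any question linked to that node is found regardless of the wording used in the original document.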

The Power of Semantic Text Similarity

Ontotext’s solution uses one of the latest features of its signature semantic graph database, GraphDB: the semantic similarity plugin. Thanks to this plugin, GraphDB’s semantic text similarity search matches words that co-occur with other words in the same context across documents. It then returns the 10 Q&A pairs in the database that are most similar to the new question. As a result, Pharma company analysts save a lot of time and effort and can easily reuse company knowledge.
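The intuition behind co-occurrence-based similarity can be shown with a small, self-contained sketch. This is not how the GraphDB plugin is implemented; it swaps in plain co-occurrence counts and cosine similarity, and the function names, the stopword list and the sample questions are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict

STOPWORDS = {"the", "of", "is", "for", "what", "which", "how", "was", "a", "and"}


def tokenize(text):
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def build_word_vectors(documents):
    """Represent each word by counts of the words it shares a document with."""
    vectors = defaultdict(Counter)
    for doc in documents:
        words = set(tokenize(doc))
        for word in words:
            for other in words:
                if other != word:
                    vectors[word][other] += 1
    return vectors


def embed(text, word_vectors):
    """Sum the co-occurrence vectors of the words in the text."""
    total = Counter()
    for word in tokenize(text):
        total.update(word_vectors.get(word, {}))
        total[word] += 1  # keep the word itself as a dimension
    return total


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k_similar(query, questions, word_vectors, k=10):
    """Return up to k (score, question) tuples, most similar first."""
    query_vec = embed(query, word_vectors)
    scored = [(cosine(query_vec, embed(q, word_vectors)), q) for q in questions]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


archive = [
    "What is the shelf life of the tablets?",
    "Which stability studies support the proposed shelf life?",
    "Describe the validation of the analytical method.",
    "How was the dissolution method validated?",
]
vectors = build_word_vectors(archive)
query = "Provide stability data for the expiry period"
results = top_k_similar(query, archive, vectors, k=2)
for score, question in results:
    print(round(score, 3), question)
```

In this toy archive, the first question shares no keyword with the query, yet it still receives a positive score because "stability" co-occurs with "shelf" and "life" elsewhere in the archive. That is the effect a purely keyword-based search would miss.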

As pharmaceutical manufacturers strive for a high standard of public trust, while fostering innovation and working to enhance public health, they are turning more and more to AI-based solutions like Ontotext Semantic Similarity Search in Documents. Such technologies will empower them to face up to some of the industry changes in regulation well into the future.

Although this particular solution was developed for a very specific Pharma Regulatory use case, the system’s functionality applies to all types of domains because it is based on a generic technology.


Marketing Manager at Ontotext

Milen Yankulov has vast experience in both traditional and digital marketing communications. His professional interests include, but are not limited to, web and news media, semantic search, social channels and all digital disruptions that change the way we communicate and do business.
