Read about how predicted links between genes and diseases can help speed up target validation and drug development and provide novel insights into disease biology.
Recent statistics shed light on the realities of current drug development: out of about 10,000 compounds that undergo clinical research, only one emerges successfully as an approved drug. This disheartening success rate reflects the immense challenges the pharmaceutical industry faces today.
To make things worse, the journey from bench to patient bedside now exceeds 12 years. The long wait comes from the extensive testing needed to ensure that a drug is safe and effective before it becomes available to those who need it.
One of the main hurdles in drug development is that companies often build biological hypotheses that are not based on solid scientific evidence. The current process involves costly wet lab experiments, which are often repeated multiple times to achieve statistically significant results. Alternatively, many different options are screened in bulk, speculatively, until one works, but without a proper understanding of the underlying pathological mechanisms.
But what if we reverse this process? What if we first find out how a disease occurs, what pathological processes are involved on the cellular and molecular levels, which genes and proteins play a role and how they interact? Then we would be able to build strong hypotheses and run validation experiments that are faster, cheaper and better grounded in evidence. This would minimize the risk of investing resources in fundamentally wrong assumptions and give researchers more confidence when choosing top targets for validation.
Another problem in drug development is that although companies gather a lot of data from multiple databases, they struggle to derive key insights about safety, druggability, etc. They need some kind of compass to find their way in the data jungle and identify information that is useful and relevant to their specific use case.
As already mentioned, the main challenge for the target discovery process is getting insights relevant to a specific therapeutic area of interest quickly and efficiently. The process also creates a huge backlog of targets to be screened and validated. This means lower success rates, exceeded research budgets, loss of competitive edge and more inefficiencies – all of which impact the business side of research.
Target discovery also involves a lot of collaboration between researchers and technical teams (data scientists, computational biologists, etc.). Technical teams struggle to retrieve information and provide it to researchers, which currently costs them significant time and effort.
Researchers have to sift through huge amounts of data, dispersed over multiple databases as well. They frequently spend hours reading through hundreds of publications to find new insights and then confirm them with structured information. On top of that, data is sometimes unreliable, and inaccurate or missing metadata makes it hard to decide which information to trust.
Ontotext’s Target Discovery solution addresses many of these problems and challenges. It helps researchers work with tons of scientific data, collaborate and discover new insights with confidence.
Ontotext’s Target Discovery consumes data using our LinkedLifeData Inventory – it offers over 200 integration- and analytics-ready datasets. They can be easily maintained in a knowledge graph and can be plugged in and out depending on the customer’s use case.
We use these data sources to normalize the data we find in the literature. Our collection incorporates over 80 million documents, including research articles, patents and clinical trials. We extract information from them automatically with custom AI algorithms and integrate it into the knowledge graph. This saves valuable time and effort for both research and technical teams. All derived facts can be further put into context with structured data, which improves data quality and presents researchers with clear evidence and provenance for all insights.
Then, Ontotext’s Target Discovery provides deeper insights into the data stored in this highly interlinked knowledge graph, where long sequences of relations can be mined. It provides a single access point across multiple datasets and makes it easy to find information quickly and efficiently. The data model can adapt to the iterative development of use cases across different therapeutic areas. It can also be easily maintained and upgraded, which lowers the cost of adding new data and keeping existing data up to date.
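To make the idea of mining long sequences of relations more concrete, here is a minimal Python sketch. It is not how our solution is implemented: the triples, identifiers and source names are made-up placeholders, and `find_paths` is a hypothetical helper that simply runs a breadth-first search over a tiny in-memory set of triples while keeping the source of every hop.

```python
from collections import deque

# Toy triples: (subject, relation, object, source).
# All identifiers and sources are illustrative placeholders, not real facts.
TRIPLES = [
    ("gene:G1", "encodes", "protein:P1", "dataset_A"),
    ("protein:P1", "interacts_with", "protein:P2", "dataset_B"),
    ("protein:P2", "participates_in", "pathway:W1", "dataset_C"),
    ("pathway:W1", "implicated_in", "disease:D1", "article_123"),
    ("gene:G2", "encodes", "protein:P2", "dataset_A"),
]


def build_index(triples):
    """Group outgoing edges by subject for quick lookup."""
    index = {}
    for subj, rel, obj, source in triples:
        index.setdefault(subj, []).append((rel, obj, source))
    return index


def find_paths(triples, start, goal, max_hops=5):
    """Breadth-first search for relation chains from start to goal.

    Each returned path is a list of (subject, relation, object, source)
    steps, so every hop can be traced back to where it came from.
    """
    index = build_index(triples)
    queue = deque([(start, [])])
    paths = []
    while queue:
        node, path = queue.popleft()
        if node == goal and path:
            paths.append(path)
            continue
        if len(path) >= max_hops:
            continue
        # Nodes already on this path; skipping them avoids cycles.
        visited = {start} | {step[2] for step in path}
        for rel, obj, source in index.get(node, []):
            if obj not in visited:
                queue.append((obj, path + [(node, rel, obj, source)]))
    return paths


if __name__ == "__main__":
    for path in find_paths(TRIPLES, "gene:G1", "disease:D1"):
        for subj, rel, obj, source in path:
            print(f"{subj} --{rel}--> {obj}  [{source}]")
```

In a real knowledge graph this kind of traversal would normally be expressed as a graph query rather than hand-written search code; the sketch only shows why keeping the source next to each relation makes a multi-hop chain traceable.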
Our Target Discovery solution helps researchers and technical teams to collaborate. It makes it easier to find information and explore it visually, and it enables deeper data analytics based on graph and AI algorithms. The solution also eliminates the need for extensive programming knowledge and deep technical skills. It allows researchers to apply machine learning algorithms to discover new facts or predict new links.
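As a rough illustration of what predicting new links can mean, the sketch below scores gene-disease pairs that are not yet connected by how many neighbors (here, pathways) they share, using the Jaccard coefficient. This is a simple textbook heuristic chosen for readability, not the machine learning approach used in our solution, and the graph, identifiers and the `predict_links` helper are all hypothetical.

```python
# Toy undirected edges; identifiers are illustrative placeholders.
EDGES = [
    ("gene:G1", "pathway:W1"),
    ("gene:G1", "pathway:W2"),
    ("gene:G2", "pathway:W2"),
    ("gene:G3", "pathway:W3"),
    ("disease:D1", "pathway:W1"),
    ("disease:D1", "pathway:W2"),
    ("disease:D2", "pathway:W3"),
]


def neighbors(edges):
    """Build an adjacency map: node -> set of directly connected nodes."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def jaccard(adj, u, v):
    """Shared neighbors divided by all neighbors of u and v."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def predict_links(edges, genes, diseases):
    """Score every gene-disease pair that has no direct edge yet."""
    adj = neighbors(edges)
    scored = [
        (gene, disease, jaccard(adj, gene, disease))
        for gene in genes
        for disease in diseases
        if disease not in adj.get(gene, set())
    ]
    # Highest score first; ties keep their input order.
    return sorted(scored, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    ranked = predict_links(
        EDGES,
        genes=["gene:G1", "gene:G2", "gene:G3"],
        diseases=["disease:D1", "disease:D2"],
    )
    for gene, disease, score in ranked:
        print(f"{gene} -> {disease}: {score:.2f}")
```

A score like this is only a hint that a link may exist. It still needs to be checked against the supporting evidence before anyone acts on it.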
There is also automatic ranking, based on predefined criteria, that makes it easy for users to decide which therapeutic candidates are worth pursuing. Confidence scores and provenance metrics let researchers trace any insight back to its sources and make evidence-driven decisions with more confidence.
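One simple way to picture ranking by predefined criteria is a weighted sum of per-criterion scores, with the supporting evidence kept alongside each candidate. The sketch below assumes exactly that. The criteria names, weights, scores and evidence labels are invented for illustration and do not reflect the actual ranking logic of our solution.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    name: str
    scores: dict                                   # criterion -> score in [0, 1]
    evidence: list = field(default_factory=list)   # where the scores came from


# Hypothetical criteria and weights that a research team might agree on.
WEIGHTS = {"disease_association": 0.5, "druggability": 0.3, "safety": 0.2}


def rank(candidates, weights):
    """Return (name, weighted score, evidence) tuples, best candidate first.

    A criterion missing from a candidate counts as 0.0.
    """
    total = sum(weights.values())

    def weighted(candidate):
        raw = sum(w * candidate.scores.get(k, 0.0) for k, w in weights.items())
        return raw / total

    rows = [(c.name, round(weighted(c), 3), c.evidence) for c in candidates]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    candidates = [
        Candidate("T3", {"disease_association": 0.8, "safety": 0.9},
                  ["dataset_A"]),
        Candidate("T1", {"disease_association": 0.9, "druggability": 0.4,
                         "safety": 0.8}, ["dataset_A", "article_123"]),
        Candidate("T2", {"disease_association": 0.6, "druggability": 0.9,
                         "safety": 0.5}, ["dataset_B"]),
    ]
    for name, score, evidence in rank(candidates, WEIGHTS):
        print(name, score, evidence)
```

Because the evidence travels with the score, a researcher looking at the top of the list can see which sources pushed a candidate up and decide whether to trust them.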
Let’s have a look at a real-life case study. Our customer is a U.S.-based company that works on cancer immunotherapies. Before implementing our solution, they did a lot of the work manually. They had to read hundreds of scientific papers daily, search in various databases such as NCBI, UniProt, and STRING DB, and collect all of this information in an Excel spreadsheet so that they could collaborate in their research group.
In the process, they managed to identify hundreds of potential therapeutic candidates, which could theoretically be used to induce anti-cancer effects on the molecular level. It took them several hours to a couple of days to evaluate the hypothesis for each candidate and uncover sufficient scientific evidence to move forward.
After choosing our Target Discovery solution, they selected approximately 35 datasets that they used on a daily basis and that fitted their scientific needs. We integrated these datasets into a knowledge graph and created hundreds of mappings between them so the customer could easily analyze the network of knowledge, infer new gene-disease relationships and backtrack the evidence for all new insights. The solution helped them drive innovation from previously undiscovered relationships and speed up their hypothesis generation and evaluation by 1000%.
We also applied tailored Natural Language Processing (NLP) pipelines to 80 million documents and provided the researchers with more than 500 million additional relations. These were all mapped in the knowledge graph, delivering more evidence and new facts. All AI-derived data comes with provenance and accuracy metrics so that the research teams can make crucial decisions based on solid evidence.
In the fast-paced world of drug discovery, the challenges faced by researchers and their technical teams impact the lives of patients. Speeding up the process while ensuring safe and effective therapies is both a game-changing endeavor and a necessity.
Ontotext’s Target Discovery solution offers a novel and collaborative approach to streamline the process. By drawing on public datasets and extracting valuable insights from scientific literature, it empowers researchers and technical teams to leverage cutting-edge technologies like knowledge graphs and NLP. Its user-friendly and customizable nature enables users to explore the data visually and provides metrics that automate target ranking and reduce risks.
We’ll continue to develop the potential of our solution to help researchers easily navigate the complex data landscape and discover safe and effective therapies.