Natural Language Querying (NLQ) enables users to interact with complex databases – yes, including knowledge graphs – using ordinary human language, eliminating the need for specialized query language skills. Over the past year, NLQ has risen to prominence as an indispensable tool for unlocking the potential of structured and unstructured data alike.
With recent advancements in generative models, research on NLQ has been dominated by approaches that use large language models (LLMs) to understand human questions and provide natural language answers. LLMs and conversational interfaces have clearly demonstrated how easily information can be explored and extracted from even very large knowledge structures.
They have opened the door to a next-level user experience for searching and consuming knowledge. Enterprise organizations with large amounts of data hosted in various data stores, including knowledge graphs, seek to enable such interfaces for consuming that information. The goal is to enhance knowledge discovery and let non-technical users and employees benefit from all the information for knowledge-driven decision-making.
So, how do LLMs help in tackling NLQ tasks?
Several approaches for NLQ with the help of LLMs have emerged recently, ranging from extending the model's context with external knowledge to fine-tuning it on domain-specific data.
Many libraries and tutorials are emerging with helpful tools and guidelines to speed up such development, the fastest-growing of which is currently LangChain.
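At their core, most of these approaches share one pattern: give the model a description of the data together with the user's question, and ask it to produce a structured query or an answer. A minimal sketch of that pattern, where the `call_llm` callable stands in for any chat-completion client and is purely illustrative:

```python
# A placeholder LLM callable keeps the sketch self-contained; in a real
# system, swap in an actual chat-completion client.

def build_nlq_prompt(schema: str, question: str) -> str:
    """Assemble a prompt asking the model to translate a question into a query."""
    return (
        "You are a query generator. Given this data schema:\n"
        f"{schema}\n"
        "Translate the user's question into a query over that schema.\n"
        f"Question: {question}\nQuery:"
    )

def answer_question(call_llm, schema: str, question: str) -> str:
    """Run the assembled prompt through the model and return its output."""
    return call_llm(build_nlq_prompt(schema, question))
```

Frameworks like LangChain essentially package this prompt-assembly and model-invocation loop, together with retrieval, memory, and parsing utilities.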
All of these techniques empower an out-of-the-box LLM to answer questions better, even those that require external knowledge. The field of NLQ is advancing quickly, but major challenges remain, notably keeping answers accurate and transparent and handling domain-specific terminology.
Knowledge graphs provide a helpful addition to these approaches thanks to the structured knowledge representation in the form of ontologies, as well as the contextual richness and connectedness of the data, which enables semantic reasoning. These features can be particularly useful in domain-specific NLQ systems where understanding the specific terminology and relationships is crucial. Semantic technologies and knowledge graphs can enhance LLMs by extending their context in a rich, accurate, and transparent manner.
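One way this plays out in practice is retrieval-style grounding: look up facts about the entities mentioned in a question and hand them to the LLM as context. The sketch below uses an in-memory dictionary in place of a real triple store; the graph contents and function names are invented for illustration:

```python
# An in-memory stand-in for a triple store: entity -> list of
# (predicate, object) pairs. Contents are invented for the example.

def facts_for(graph: dict, entity: str) -> list:
    """Render subject-predicate-object facts about one entity as plain text."""
    return [f"{entity} {p} {o}" for p, o in graph.get(entity, [])]

def grounded_prompt(graph: dict, entities: list, question: str) -> str:
    """Build a prompt whose context section is filled from the graph."""
    context = "\n".join(f for e in entities for f in facts_for(graph, e))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

Because the context is assembled from explicit graph statements rather than opaque model memory, the answer can be traced back to its sources, which is exactly the transparency benefit described above.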
Ontotext’s products cover the full spectrum of foundational technologies to bring you up to speed with your NLQ development.
Ontotext’s Semantic Objects provides a GraphQL interface on top of the knowledge graph data that LLMs can pick up very easily. GraphQL is one of the most developer-friendly interfaces for consuming knowledge graphs. Semantic Objects exposes the schema of the data in a readable YAML format, which makes it easy for an LLM to parse and understand the data model. Alternatively, for larger schemas, dynamic schema introspection can be integrated into the query generation process.
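To make this concrete, here is a sketch of the idea: a schema excerpt (the YAML shape and field names below are invented for the example, not actual Semantic Objects output) and a helper that renders the kind of GraphQL query an LLM might generate from it:

```python
# An invented YAML schema excerpt of the sort an LLM could be handed as
# context when asked to generate a GraphQL query.
schema_yaml = """\
objects:
  Patient:
    props:
      name: {range: string}
      labResult: {range: LabResult}
"""

def generate_query(object_name: str, fields: list) -> str:
    """Render a minimal GraphQL query for one object type and its fields."""
    field_block = "\n    ".join(fields)
    return f"query {{\n  {object_name} {{\n    {field_block}\n  }}\n}}"
```

Given the schema above, `generate_query("patient", ["name", "labResult"])` yields a plain GraphQL document that the Semantic Objects endpoint could execute, keeping the LLM's job limited to choosing the object type and fields.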
To fine-tune an LLM for NLQ, you need a high-quality training dataset of question-and-answer pairs over a representative sample of your proprietary knowledge. Ontotext Metadata Studio (OMDS) makes producing such a dataset for LLM fine-tuning easy.
Using OMDS, general-purpose LLMs can be turned into specialized ones based on proprietary content, domain models, and expertise. A training dataset from a custom domain, enriched with high-quality semantic metadata, can teach your LLM instance to excel in the new area. With all of this available right on top of your knowledge graph, OMDS is the perfect companion for the job.
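As an illustration of what such a training dataset might look like on disk, the sketch below serializes question-and-answer pairs as JSON Lines in a common chat-style record shape. The example row and the exact record format are assumptions for illustration, not OMDS output:

```python
import json

# Invented example Q&A pairs of the kind a domain expert might curate.
examples = [
    {"question": "Which suppliers provide part X-100?",
     "answer": "Acme Ltd. and Nordix supply part X-100."},
    {"question": "What is the lead time for part X-100?",
     "answer": "The typical lead time for part X-100 is six weeks."},
]

def to_jsonl(rows) -> str:
    """Serialize Q&A pairs as JSON Lines in a chat-style fine-tuning format."""
    return "\n".join(
        json.dumps({"messages": [
            {"role": "user", "content": r["question"]},
            {"role": "assistant", "content": r["answer"]},
        ]})
        for r in rows
    )
```

One record per line keeps the dataset streamable and easy to split into training and validation sets, which is why JSON Lines is a common choice for fine-tuning pipelines.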
NLQ has great potential in a wide range of industries such as Healthcare, Financial Services, Infrastructure, and Manufacturing. In reality, few domain experts in these areas possess the deep technical command of query languages needed to take full advantage of graph data in their day-to-day analysis. By combining semantic graph technologies with LLMs, you can make knowledge graphs easier to enrich, consume, and understand, so the value they unlock can be democratized to a wider public.
In Healthcare, for example, semantic technologies and knowledge graphs improve personal healthcare by providing a comprehensive patient dossier for clinical decision-making and enabling access to high-quality data for clinical research. NLQ can be used to quickly retrieve patient information or research data, simply by asking questions like “What were the patient’s last lab results?” or “What are the recent studies on this disease?”. This can make data management more efficient and can help in making timely diagnosis and treatment decisions.
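For a sense of what happens behind the scenes, here is how an NLQ system might render the question “What were the patient’s last lab results?” as SPARQL over the knowledge graph. The `ex:` vocabulary and property names are invented purely for illustration:

```python
def last_lab_results_query(patient_iri: str, limit: int = 5) -> str:
    """Build a SPARQL query fetching a patient's most recent lab results."""
    # The ex: prefix and properties below are a made-up clinical vocabulary.
    return f"""\
PREFIX ex: <http://example.org/clinical/>
SELECT ?test ?value ?date WHERE {{
  <{patient_iri}> ex:hasLabResult ?r .
  ?r ex:test ?test ; ex:value ?value ; ex:date ?date .
}}
ORDER BY DESC(?date)
LIMIT {limit}"""
```

The LLM's role is to map the free-text question onto this structured shape; the graph then guarantees that the results returned are actual recorded facts about that patient.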
In Financial Services, knowledge graphs allow organizations to derive more value from their data by capturing unique relationships between data points, which is crucial for complex queries and analytics. NLQ enhanced by knowledge graphs can help data analysts trace and understand the origins of the information and access real-time insights to make informed knowledge-driven decisions.
Another example is Industry and Manufacturing, where knowledge graphs help bridge the gap between different industry sectors by revolutionizing how data is structured and analyzed, leading to better knowledge management and process automation. By querying systems in natural language, users can quickly find information relevant to their tasks, be it maintenance and troubleshooting, supply chain management, or security and compliance.
In all of the areas above, the aspect of knowledge sharing and collaboration across organizations is key. NLQ can facilitate the discovery and access to relevant information, expertise, best practices, and lessons learned, thus promoting collaboration, encouraging knowledge exchange, and preventing silos of information.
It is no wonder that the emergence of LLMs has generated such unprecedented hype – people feel empowered by how accurately a machine can understand and meet their needs. Such a seamless experience will increasingly be expected and demanded by users of the applications around us. Knowledge graphs provide key features to unlock NLQ capabilities, powered by data connectedness, semantic context, and inference.
Want to learn more about natural language querying?