Ontotext Platform 3.4 Brings Better Search and Aggregation in Knowledge Graphs

The new version of the Platform introduces a Semantic Search Service that enables engineering teams to declaratively configure search and aggregation views over knowledge graphs and consume the data using the GraphQL interface

Sofia, Bulgaria, Tuesday, April 13, 2021

The heart of Ontotext Platform 3 is its declarative approach to accessing and managing large-scale knowledge graphs (KGs). It allows engineering teams to define specific GraphQL interfaces for reading and writing data over parts of a knowledge graph and lets the Platform handle an efficient translation of GraphQL to SPARQL.
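
As a rough illustration, the query below sketches what such a generated GraphQL interface could look like for a hypothetical "Person" object type. The object and field names are purely illustrative and depend on the Semantic Objects model of a concrete deployment; the Platform translates the request into SPARQL behind the scenes.

    # Illustrative query against a generated Semantic Objects GraphQL endpoint.
    # The "person" object type and its fields are hypothetical; real names come
    # from the Semantic Objects model of the specific deployment.
    query PeopleAndOrganizations {
      person {
        id
        name
        memberOf {     # relation to another object type in the knowledge graph
          id
          name
        }
      }
    }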

Ontotext Platform 3.4 combines the power of GraphDB, Elasticsearch and GraphQL by enabling the definition, automatic synchronization and querying of indices to boost the performance of specific queries. The Workbench front-end tool of the Platform features a new generic search interface for KG exploration and navigation. The new version of the Semantic Object service delivers up to 10 times better performance when executing large, data-intensive GraphQL queries on top of GraphDB.

Semantic Search Service

Ontotext Platform now extends its capabilities with a major new component: a Semantic Search Service. The service is declaratively configurable, easily accessible via a generated GraphQL API, and transpiles GraphQL to Elasticsearch queries. It enables software engineers to easily deliver some of the knowledge graph capabilities most required by SMEs, such as Full-text Search (FTS), Auto-complete/typeahead (related concepts and controlled vocabulary), Auto-suggest (related keywords and phrases), Faceted search, complex dashboards using different statistical and/or bucket aggregations, etc.

To achieve these capabilities and more, Ontotext Platform extends the Semantic Object Modeling Language with additional options for configuring the Semantic Objects. Software engineers can now declare which Semantic Objects should have indices, apply a filter if necessary, configure the search shape, specify how deep the data should be indexed along each relation of the node, choose which search analyzers should be used, etc.

These capabilities provide great flexibility but can be overwhelming, especially for users who are not familiar with all possible Platform or Elasticsearch configurations. Therefore, each configuration has default values, which enable a novice user to simply mark the object types needed for their search and get the rest out of the box.

Based on the declarative configuration, Ontotext Platform will create the required indices in Elasticsearch, generate a GraphQL schema and endpoint supporting the declared objects, configure the GraphDB Elasticsearch connector, and trigger the loading of the knowledge graph data into the respective indices. Consequently, software engineers will be able to use a well-defined and rich GraphQL endpoint that provides a GraphQL representation of the Semantic Search objects and a large set of Elasticsearch features, following the Elasticsearch query syntax as closely as possible.

As a result of our implementation, software engineers can use all of the most widespread Elasticsearch queries, aggregations and features (a query sketch follows the list), such as:

  • Queries – terms, boolean, match, wildcard, phrase queries, etc.
  • Aggregations
    • Bucket – terms, filters, range, nested, etc.
    • Statistical – sum, min, max, avg, count, cardinality, etc.
  • Sort, pagination, boosting, fuzzy matching, etc.
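
As a rough sketch of how such a request could look over the generated GraphQL endpoint, a full-text query combined with a terms facet might be expressed as follows. All type, field and argument names below are illustrative assumptions rather than the actual generated schema, which is derived from the declarative search configuration of a concrete deployment.

    # Illustrative search request; real names and arguments will differ.
    query FacetedDocumentSearch {
      documentSearch(
        match: "knowledge graph"        # full-text query (hypothetical argument)
        facetBy: ["category", "year"]   # terms aggregations (hypothetical argument)
        first: 10                       # pagination (hypothetical argument)
      ) {
        hits {
          id
          title
          category
        }
        facets {
          field
          buckets {
            value
            count
          }
        }
      }
    }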

The provided GraphQL endpoint will enable users not only to search the data but also to retrieve the data for the result list directly from Elasticsearch. In addition, if GraphQL Federation is enabled, engineers can request, in the same query, information from the Semantic Object service or other federated services and build the more comprehensive response they may need.
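
A hedged sketch of such a federated request is shown below. Assuming a hypothetical "documentSearch" field served by the Semantic Search Service and a hypothetical "author" relation resolved by the Semantic Object service, a single query could combine results from both.

    # Illustrative federated query; field names are hypothetical and the split
    # between services depends on the concrete deployment and federation setup.
    query SearchWithGraphData {
      documentSearch(match: "graph analytics", first: 5) {
        hits {
          id
          title            # returned directly from Elasticsearch
          author {         # resolved by the Semantic Object service via federation
            name
            affiliation
          }
        }
      }
    }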

Another major benefit provided by the Platform is the automatic synchronization between GraphDB write operations and Elasticsearch. This ensures that all write operations in GraphDB, whether triggered by a GraphQL mutation or a SPARQL update, are automatically synchronized to the Elasticsearch indices based on the declared model.
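
For example, a write issued through the generated GraphQL API, such as the hypothetical mutation sketched below (the mutation name and input shape are illustrative, not the exact generated signature), would be reflected in the corresponding Elasticsearch index without any extra client-side work; the same holds for an equivalent SPARQL update.

    # Illustrative mutation; real mutation names and input types are generated
    # from the Semantic Objects model of the deployment.
    mutation AddDocument {
      createDocument(input: { title: "Knowledge graphs at scale", category: "report" }) {
        id
        title
      }
    }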

All components of the new service are dockerized, available for Kubernetes and manageable with predefined Helm charts.

To summarize, the new Ontotext Platform’s Semantic Search Service features many advantages over GraphDB’s Elasticsearch connector. More specifically, the service provides:

  • access to a large subset of Elasticsearch features over GraphQL, whereas GraphDB exposes a limited set of features through SPARQL;
  • automatically generated GraphDB Elasticsearch connectors based on the declarative Platform model;
  • a GraphQL endpoint that can be integrated with external services and automatic federation with the Semantic Object service;
  • the ability to return data directly from Elasticsearch, whereas GraphDB can return only limited data such as object IDs and snippets;
  • the ability to execute complex aggregation over the knowledge graph representation in Elasticsearch;
  • the use of all keyword, number and date fields as facets without the need to declare them explicitly as such in the connector.

Workbench Improvements

Ontotext Platform 3.4 introduces an auto-configurable search page that provides Full-text Search (FTS), Auto-complete (related concepts and controlled vocabulary) and Faceted search over the knowledge graph. The search page uses the declarative configurations and automatically generates the proper queries to execute the above-mentioned searches and apply facets based on the search result list and object properties configuration. In addition, the full shape and related data of an object from the search result list can be displayed, so the user can view the available information in all federated services.

With the introduction of the new Semantic Search Service and its new GraphQL endpoint, we’ve updated the Workbench GraphQL page, so the user can select the available endpoints and execute queries. These endpoints depend on the specific Ontotext Platform deployment and can include: Semantic Object service endpoint, Semantic Search Service endpoint and Federation Service endpoint.

The schema generation was improved to support several OWL cardinality options and to incorporate them in the resulting Platform schemas. This aligns the schemas better with the ontology model and optimizes the generated SPARQL queries.
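
As a simplified sketch of how such cardinality information could surface in a generated schema (the types and the exact mapping below are illustrative assumptions, not the precise rules used by the Platform), a required single-valued property could become a non-null field, while a property with no upper bound could become a list field.

    # Illustrative fragment of a generated GraphQL schema. The mapping shown
    # (min cardinality 1 -> non-null, max cardinality 1 -> single value,
    # no upper bound -> list) is a simplified assumption for illustration.
    type Person {
      id: ID!
      name: String!               # e.g. cardinality exactly 1 -> required single value
      nickname: String            # max cardinality 1, optional
      memberOf: [Organization!]   # no upper bound -> list of related objects
    }

    type Organization {
      id: ID!
      name: String!
    }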

Performance and Memory Improvements

The new release significantly reduces the Ontotext Platform overhead in terms of query time and memory footprint. After a redesign of the SPARQL query processing, large, data-intensive queries are now significantly faster and require less processing memory. With some additional optimizations, the overall memory footprint of the Platform has been cut in half. There are also improvements in handling GraphQL introspection queries for very big models, and an introspection query cache has been introduced to speed up introspection query response times and Workbench responsiveness when working with big GraphQL schemas.

Last but not least, Ontotext Platform 3.4 includes many other improvements, such as better handling of missing values for multi-valued GraphQL properties, various bug fixes, and updates addressing all resolved high-level security vulnerabilities in the Platform dependencies.

Get ready to take advantage of all new Ontotext Platform 3.4 features!


For more information, contact Doug Kimball, Chief Marketing Officer at Ontotext