Ontotext’s GraphDB 8.7 Offers Vector-Based Concept Matching and Better Scalability, Performance and Data Governance

Friday, October 5, 2018

Ontotext is releasing GraphDB 8.7, the latest version of its semantic graph database. The release adds support for concept matching in knowledge graphs through a new plugin that returns similar terms, documents and entities based on statistical semantics methods such as Random Projection. It also includes performance improvements that enable efficient query federation across repositories hosted in a single database instance and faster updates in big knowledge graphs. Active repositories can now be used as a source for backup, without downtime for read operations, even in non-cluster environments.

GraphDB 8.7 features the Semantic Vectors package integrated as a plugin. It enriches the RDF graph with semantic similarity indices based on a highly scalable vector space model. As with the full-text search connectors of GraphDB, developers and administrators can define various indices that cover specific types of documents and entities, specific attributes and property paths.

For instance, in a single repository, there could be one index covering news, another one covering all sorts of entities, e.g., people, organizations, locations, and the third one for topics. These indices can be used to embed similarity searches as part of SPARQL queries to enable flexible combinations of graph pattern-based queries with reasoning, semantic text similarity, full-text search, geo-spatial constraints, etc. Unlike full-text search, one can search for similarity across all combinations of terms and documents: get terms similar to a given one, find similar entities or documents, search for a document by term or get terms most characteristic for a document.
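To give a feel for how a random-projection similarity index relates terms and documents, here is a minimal Python sketch. It is not the plugin's implementation: the function names, the vector dimension, the number of non-zero entries and the sample documents are all assumptions made for illustration. Each document receives a sparse random vector, and each term vector is the sum of the vectors of the documents in which the term occurs, so terms that appear in the same documents end up close together.

```python
import math
import random


def index_vector(dim, nonzero, rng):
    """Sparse random vector: `nonzero` positions set to +1 or -1, the rest 0."""
    vec = [0.0] * dim
    positions = rng.sample(range(dim), nonzero)
    for i, pos in enumerate(positions):
        vec[pos] = 1.0 if i % 2 == 0 else -1.0
    return vec


def build_term_vectors(docs, dim=256, nonzero=8, seed=42):
    """Sum, for every term, the random vectors of the documents containing it.

    `docs` maps a document id to its text. All parameters are illustrative
    assumptions, not values taken from GraphDB or Semantic Vectors.
    """
    rng = random.Random(seed)
    doc_vectors = {doc_id: index_vector(dim, nonzero, rng) for doc_id in docs}
    term_vectors = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            acc = term_vectors.setdefault(term, [0.0] * dim)
            for i, value in enumerate(doc_vectors[doc_id]):
                acc[i] += value
    return term_vectors


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(term, term_vectors, top_n=3):
    """Rank all other terms by cosine similarity to `term`."""
    target = term_vectors[term]
    scored = [
        (other, cosine(target, vec))
        for other, vec in term_vectors.items()
        if other != term
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

Calling `most_similar("graph", build_term_vectors(docs))` on a small dictionary of texts ranks the terms that share documents with "graph" first. The same vectors could be compared between a term and a document, which is the kind of mixed term and document lookup described above.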

Ontotext is continuously improving GraphDB, expanding its functionality to serve various use cases. Thanks to this new plugin, users can get more results based on the matching of semantically close concepts. Such results cannot be obtained via structured and full-text search queries, which require exact matching of words or identifiers. For example, when a new question is submitted to a help desk, users will be able to find similar questions in the database and see how they have been answered in the past. In a news collection and processing scenario, similarity search can be used to interlink and group news about the same story from different sources.

GraphDB now has a much faster protocol for internal federation across repositories in the same database instance. Query evaluation is several times faster than in a scenario where data across repositories is combined using non-optimized SPARQL federation. The actual speedup depends on the federation pattern: evaluations show 2x to 8x better performance across different queries in a news analytics scenario using semantic metadata and big knowledge graphs derived from Ontotext’s FactForge demonstration service.

This optimization makes feasible a wide range of scenarios related to better data governance. For instance, it enables the segmentation of the RDF graph into multiple repositories with different security settings and update procedures. It also enables multi-tenant knowledge-graph-as-a-service scenarios, where multiple proprietary repositories can be queried together with a big non-proprietary domain knowledge graph.
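As an illustration of the kind of query that federation serves, the Python sketch below builds a query with the standard SPARQL 1.1 `SERVICE` clause, which joins data in the repository receiving the query with data in a second repository. The endpoint address, the `ex:` vocabulary and the function name are hypothetical, and the exact way GraphDB 8.7 addresses a repository for its optimized internal federation is not shown here.

```python
def federated_query(remote_endpoint: str) -> str:
    """Return a SPARQL 1.1 query joining local news metadata with a second repository.

    `remote_endpoint` and the ex: vocabulary are placeholders for illustration.
    """
    return f"""
PREFIX ex: <http://example.org/schema#>
SELECT ?article ?org ?orgLabel
WHERE {{
    # Evaluated against the repository that receives the query.
    ?article ex:mentions ?org .
    # Evaluated against the second repository through SPARQL 1.1 federation.
    SERVICE <{remote_endpoint}> {{
        ?org ex:label ?orgLabel .
    }}
}}
""".strip()
```

For example, `federated_query("http://localhost:7200/repositories/factforge")` would combine a proprietary news repository with a shared domain knowledge graph, assuming that address points at the second repository.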

Another performance improvement supports dynamic management of big knowledge graphs. GraphDB 8.7 commits faster thanks to an optimization that minimizes transaction overhead. The result is an update rate twice as fast in scenarios like the one in the LDBC Semantic Publishing Benchmark, where small transactions make frequent updates to a knowledge graph with 1 billion statements.

Last but not least, GraphDB 8.7 features improvements which make the operations of critical database instances easier and more reliable. GraphDB can now create a backup of an active repository by changing its status from read/write to read-only. This enables efficient backups without read downtime.

We have also upgraded GraphDB 8.7 to the latest versions of Elasticsearch v.6.3.x, Solr v.7.4.x and Lucene v.7.4.x, and refactored it to use the REST client instead of the Transport client.

Get your GraphDB 8.7 now and start exploring the rich knowledge discovery capabilities of our leading semantic graph database.


For more information, contact Doug Kimball, Chief Marketing Officer at Ontotext.