
CORE Module

KIM version [1.6.2.23] officially introduces the CORE module of KIM Server. CORE stands for Co-Occurrence and Ranking of Entities. It adds timeline analysis to KIM, along with a novel incremental search based on both full-text search and co-occurrence of entities. The essence of the approach is a specific kind of indexing, performed on the basis of semantic annotation of text with respect to named entities and key-phrases. These entities form a reduced-dimension feature space in which documents are characterized by the entities that occur in them. The frequency of occurrence of an entity in a document indicates its level of association with that document, and can also be read as a measure of “popularity”. The frequency of co-occurrence of entities is evidence of the existence and strength of an associative relationship between them, although the exact relation type might not be known; this evidence can be used for rich metadata extraction. The dates of the documents give the context space a temporal extent and allow for analysis of trends in popularity or association.
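The three signals described above (occurrence frequency, co-occurrence frequency, and document dates) can be illustrated with a minimal in-memory sketch. It is not the CORE implementation, which is database-backed; the toy corpus, the entity names, and the function names popularity, co_occurrence and timeline are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date
from itertools import combinations

# Hypothetical toy corpus: each document is (date, list of entity annotations).
documents = [
    (date(2006, 1, 10), ["IBM", "Microsoft", "IBM"]),
    (date(2006, 1, 25), ["IBM", "Oracle"]),
    (date(2006, 2, 3), ["Microsoft", "Oracle", "IBM"]),
]

def popularity(docs):
    """Total number of occurrences of each entity across all documents."""
    counts = Counter()
    for _, entities in docs:
        counts.update(entities)
    return counts

def co_occurrence(docs):
    """Number of documents in which each unordered entity pair appears together."""
    pairs = Counter()
    for _, entities in docs:
        # Deduplicate and sort so each pair is counted once per document
        # and always stored in the same order.
        for a, b in combinations(sorted(set(entities)), 2):
            pairs[(a, b)] += 1
    return pairs

def timeline(docs, entity):
    """Occurrences of one entity per (year, month) bucket, in date order."""
    buckets = Counter()
    for day, entities in docs:
        buckets[(day.year, day.month)] += entities.count(entity)
    return dict(sorted(buckets.items()))

if __name__ == "__main__":
    print(popularity(documents))
    print(co_occurrence(documents))
    print(timeline(documents, "IBM"))
```

Counting a pair once per document (rather than once per mention) is one possible design choice; the source does not say which weighting CORE uses.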

CoreDbAPI and TimelinesAPI are now available for the two tasks, incremental search and timeline analysis. CoreDbAPI also provides all the basic routines for storing documents, entities, aliases, etc. in the database and retrieving them from it. Although the APIs themselves allow for multiple CORE implementations, only one is available at the moment. It is based on Oracle, since scale and speed are the main concerns.
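The idea of an incremental search that combines full-text matching with entity co-occurrence can be sketched as follows. This does not use CoreDbAPI, whose method signatures are not described here; the INDEX structure and the search and suggest functions are hypothetical and exist only to show how each step can narrow the result set and propose co-occurring entities as the next refinement.

```python
from collections import Counter

# Hypothetical index: document id -> (text, set of annotated entities).
INDEX = {
    "d1": ("IBM and Microsoft announce a partnership", {"IBM", "Microsoft"}),
    "d2": ("IBM acquires a database vendor", {"IBM", "Oracle"}),
    "d3": ("Microsoft and Oracle compete on database servers", {"Microsoft", "Oracle"}),
}

def search(keyword=None, entities=frozenset()):
    """Return ids of documents that match the keyword (if given)
    and contain every entity selected so far."""
    hits = []
    for doc_id, (text, doc_entities) in INDEX.items():
        if keyword and keyword.lower() not in text.lower():
            continue
        if not set(entities) <= doc_entities:
            continue
        hits.append(doc_id)
    return hits

def suggest(hits, selected=frozenset()):
    """Rank the entities that occur in the hit set, most frequent first,
    excluding entities that are already selected."""
    counts = Counter()
    for doc_id in hits:
        counts.update(INDEX[doc_id][1] - set(selected))
    return counts.most_common()

if __name__ == "__main__":
    # Step 1: plain keyword search, then look at the co-occurring entities.
    hits = search(keyword="database")
    print(hits, suggest(hits))
    # Step 2: refine by adding entities to the restriction.
    print(search(keyword="database", entities={"Oracle", "IBM"}))
```

Each refinement step reuses the previous restrictions and adds one more, so the result set can only shrink as entities are selected.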

The KIM WebUI has two new sections, CORE Search and Timelines, that demonstrate the potential of the CORE module for incremental search and for calculating trends. Both are automatically enabled or disabled depending on whether an enabled and correctly configured CORE module is present. More configuration details are provided below.

Finally, a tool is provided for the initial creation and population of the tables and the creation of the indices (it can also be used to clear and pre-populate all the tables and re-create the indices). To run it, use bin/toolCoreDb.bat (for Windows) or ./bin/toolCoreDb.sh (for Linux). Before doing so, make sure that Sesame, KIM and the database are configured properly and running. It is a console tool, and status messages are printed as the different actions are performed. Note that the tool re-creates and clears all existing tables and populates them with all the entities currently held by Sesame, but not with any documents. It also creates the FTS (full text search) indices but does not update them, as this is done automatically by the CORE module when documents are inserted through the provided API.

Configure : Administration :: Configuration :: CORE Module