CIMA: Intelligent matching and linking of organization data from different sources

  • Completed
  • Programme: Other
  • Start date: 25.05.2018
  • End date: 24.05.2020


CIMA (Intelligent matching and linking of organization data from different sources) is funded by the European Union’s European Regional Development Fund through the Operational Programme “Innovations and Competitiveness 2014-2020”, under the call “Intelligent Specialization”.

European Regional Development Fund: 662 664.33 BGN
National funding: 116 940.79 BGN
Total budget: 1 520 570.24 BGN

CIMA aims to use Artificial Intelligence (AI) technologies for linking and harmonizing company data from various sources. The project applies machine learning, semantic modeling and data integration as well as logical inference and validation to make company and related data (persons, locations, industry taxonomies, technology fields) better harmonized, integrated, interlinked and easier to use.

Contact: Vladimir Alexiev

Project Overview

Company and company-related economic information is crucial to many business operations. It supports customer relationship management, acquisition of new clients, marketing campaigns, supply chain management, market analysis, competitive intelligence, mergers and acquisitions, etc. At present, there are approximately 300 million legal entities in the world. While company datasets can be acquired from multiple sources, no single source provides the coverage and depth of information needed for comprehensive market or business intelligence.

To build such a comprehensive view, one has to integrate data scattered across many datasets with differing data structures and access methods. Currently, company data can be found in the official company registers of hundreds of jurisdictions, in open sources (such as DBpedia, Wikidata, Panama Papers), in semi-commercial sources (such as OpenCorporates), in commercial sources (such as Dun & Bradstreet, Bureau van Dijk, FactSet, S&P Capital IQ), in investment-oriented databases (such as Crunchbase and CBI), etc. Often, the necessary information, for example about company directors or financial results, is available only in news and other textual sources.

The goal of the CIMA project is to harmonize data through semantic representation and integration, and to develop methods for semantic matching, linking and entity extraction for the chosen domain. It also includes preliminary research for the creation of an environment for data hosting and consumption.

Ontotext’s Role

In CIMA, Ontotext builds on top of its extensive expertise in cognitive analysis of databases and information. Years of experience in this area have led to the development of the Ontotext Platform. The platform facilitates the integration of structured information and text into huge Knowledge Graphs that include information about billions of concepts and the relationships between them.

CIMA strengthens Ontotext’s business line of global market intelligence products, services and solutions, which have the highest growth potential. Ontotext has already delivered such solutions for some of the world’s biggest and most reputable business information agencies, rating agencies and M&A consultants. These solutions combine open data and data from most of the global vendors of company data.

The aim of the CIMA project is to invest in R&D that makes the delivery of such solutions much more efficient in the future. The project supports the addition of new functionality to Ontotext’s leading products: Ontotext Platform and Ontotext Cognitive Cloud. This boosts the company’s competitiveness and expands the market for these products.

See the Bulgarian version.
