enRichMyData aims to create a revolutionary paradigm for building rich, useful and high-quality datasets for use in Big Data Analytics and AI applications. The paradigm involves simplification of the specification and execution of data enrichment pipelines on a large scale, with an emphasis on enabling different data enrichment processes including discovery, understanding, selection, big data cleansing, transformation and integration from many sources.
Contact: Nikola Tulechki
The main goal of the project is to develop the enRichMyData toolbox, comprising practical, robust and scalable components that support organizations in enriching their data with external reference data they may have limited knowledge of, and that support data providers in making their data reusable and available in data enrichment processes.
Due to a lack of appropriate tools and knowledge to support the cost-effective and energy-efficient management of data enrichment pipelines, a wide range of large and small organizations have difficulty delivering suitable data to feed their data analytics solutions. enRichMyData will make this paradigm easily accessible to these organizations by providing a software toolkit with tools and infrastructure services for building, implementing, running and maintaining data enrichment pipelines. enRichMyData brings together five major corporations, three SMEs, two research institutions and three universities, which will join forces to develop new methods and tools for effective and efficient management of data enrichment processes.
The results of the project will be demonstrated in a wide range of business cases in areas such as Digital Marketing (marketing data enrichment for smart bidding optimization), Manufacturing (AI-based welding analytics), Predictive Maintenance (data enrichment for smart maintenance of medical imaging systems), Public Procurement (European register of entities from Known Actions), Innovation Ecosystems (Innovation Knowledge Graph) and Mineral Processing (industrial data enrichment for mineral processing optimization).
Sirma AI, trading as Ontotext, is one of the key semantic partners in the project contributing to an array of components, encompassing not only the enRichMyData toolbox, but also the software infrastructure and communication.
The company will add several integral parts to the enRichMyData toolbox, consisting of secure semantic storage and querying, data federation, semantic integration, semantic search and virtualization, using a variety of storage technologies including, but not limited to, RDF (GraphDB) and GraphQL (Ontotext Platform and GraphQL Federation). Ontotext’s 20+ years of experience in such technologies and solutions is invaluable for transforming the way data enrichment pipelines are created and used.
Ontotext is also a key partner in developing InnoGraph, a comprehensive knowledge graph of the AI and high tech innovation ecosystems, covering themes such as countries, universities, research groups, papers, startups and other companies, research and engineering topics, job market and investments, GitHub projects and software toolkits, events and news. InnoGraph development will be led by the AI department of Institut “Jožef Stefan” (Ljubljana) and the created KG will be based on initial data and statistics from the Organisation for Economic Co-operation and Development AI Observatory.
Last but not least, Ontotext hopes to contribute to an industrial case by Bosch and to making a KG of public procurement authorities by Spend Network.
This project has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101070284. Views and opinions expressed are, however, those of the author only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.