vera.ai seeks to build professional, trustworthy Artificial Intelligence (AI) solutions against advanced disinformation techniques, co-created with and for media professionals and researchers, and to lay the foundation for future research into AI against disinformation.
Contact: Andrey Tagarev
Online disinformation and fake media content have emerged as a serious threat to democracy, the economy and society. Recent advances in AI have enabled the creation of highly realistic synthetic content and its artificial amplification through AI-powered bot networks. Consequently, it is extremely challenging for researchers and media professionals to assess the veracity of online content and to uncover increasingly complex disinformation campaigns.
The aim of vera.ai is to develop trustworthy AI solutions to be used by the widest possible community, including journalists, fact-checkers, investigators, researchers and other professionals, in the fight against disinformation. The solutions will handle different content types (audio, video, images, and text) across a variety of languages.
vera.ai adopts a multidisciplinary co-creation approach to AI technology design, coupled with open source algorithms. A key unique proposition is the grounding of the AI models in fact-checking data continuously collected from the tens of thousands of instances of real-life content verified in the InVID-WeVerify plugin and the Truly Media/EDMO platform. Social media and web content will be analyzed and contextualized to expose disinformation campaigns and measure their impact.
Key novel characteristics of the AI models will be fairness, transparency (including explainability), robustness to concept drift, continuous adaptation to the evolution of disinformation through a fact-checker-in-the-loop approach, and the ability to handle multimodal and multilingual content. Recognising the perils of AI-generated content, the project will develop tools for deepfake detection in all formats (audio, video, image, text).
Results will be validated by professional journalists and fact-checkers from project partners (DW, AFP, EUDL, EBU), external participants (through our affiliation with EDMO and seven EDMO Hubs), the community of more than 70,000 users of the InVID-WeVerify verification plugin, and by media literacy, human rights and emergency response organizations.
Sirma AI, trading as Ontotext, will continue contributing to the fight against disinformation through further development of the database of debunked content created within the WeVerify project. This is a graph database that records debunked content, so professional users can easily establish whether a new image, video, or claim has already been verified, by whom, when, and how. The database is populated automatically with fact-checks from trustworthy IFCN sources.
Content is enriched using natural language processing techniques, making it possible to distinguish between the false content being debunked and the evidence from reliable sources cited in the debunk, and enabling cross-lingual and multimodal near-duplicate search. In the course of the project, Ontotext will expand the multilingual coverage of the database with content from Central and Eastern European fact-checkers and will include debunks in all other EU languages. New functionalities will be developed for professionals to subscribe to DBKF updates and to receive recommendations of frequently debunked content.
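In simplified form, cross-lingual near-duplicate search of this kind can be sketched as nearest-neighbour matching over multilingual embeddings: claims in different languages that express the same false content land close together in a shared vector space. The sketch below uses toy vectors and hypothetical identifiers (the actual DBKF pipeline and models are not described here), so treat it as an illustration of the technique, not the project's implementation:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def find_near_duplicates(query_vec, indexed, threshold=0.9):
    """Return ids of already-debunked items whose embeddings are
    close enough to the query to count as near-duplicates."""
    return [item_id for item_id, vec in indexed.items()
            if cosine_similarity(query_vec, vec) >= threshold]

# Toy vectors standing in for multilingual sentence embeddings;
# a claim and its translation should be nearly identical vectors.
index = {
    "debunk-001": [0.9, 0.1, 0.2],   # previously debunked claim
    "debunk-002": [0.1, 0.9, 0.3],   # unrelated claim
}
query = [0.88, 0.12, 0.21]           # incoming near-duplicate
print(find_near_duplicates(query, index))  # → ['debunk-001']
```

In practice the vectors would come from a multilingual (and, for images and video, multimodal) encoder, and the linear scan would be replaced by an approximate nearest-neighbour index.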
Leveraging its AI and knowledge graph expertise, Ontotext will also work on the creation of a semantic model of disinformation campaigns and narratives, as a basis for providing insights into the major actors, spread patterns and impact of disinformation. To identify a narrative or disinformation campaign, the company will start by detecting and connecting separate cases of false information based on their multilingual and multimodal similarity. The main actors, intents, geography and other relevant characteristics will be (semi-)automatically extracted from the content of debunks in the database of debunked content. The semantic model will also serve as a basis for publishing research results on the topic as FAIR data.
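A semantic model of this kind can be illustrated as structured debunk records grouped into narratives and serialized as subject-predicate-object triples for knowledge-graph publication. Every name below (the record fields, the `dbkf:` prefix, the property names) is a hypothetical placeholder, not the project's actual schema:

```python
from dataclasses import dataclass

@dataclass
class Debunk:
    claim: str       # the false claim that was debunked
    language: str
    actors: list     # main actors, (semi-)automatically extracted
    geography: str
    narrative: str   # broader disinformation narrative it belongs to

def group_by_narrative(debunks):
    """Connect separate debunks into campaigns via a shared narrative label."""
    campaigns = {}
    for d in debunks:
        campaigns.setdefault(d.narrative, []).append(d)
    return campaigns

def to_triples(debunk, debunk_id):
    """Serialize one record as triples, the shape a FAIR/graph export would use.
    The 'dbkf:' prefix and property names are illustrative only."""
    s = f"dbkf:{debunk_id}"
    triples = [
        (s, "dbkf:claimText", debunk.claim),
        (s, "dbkf:inLanguage", debunk.language),
        (s, "dbkf:geography", debunk.geography),
        (s, "dbkf:partOfNarrative", debunk.narrative),
    ]
    triples += [(s, "dbkf:mentionsActor", a) for a in debunk.actors]
    return triples
```

Grouping by a shared narrative label is of course the trivial case; the point of the similarity-based detection described above is to assign those labels automatically across languages and modalities.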
This project has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement No 101070093. Views and opinions expressed are however those of the author only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.