DisTraceAI: Better determination of the presence of narratives in automatically analyzed content on the web
The DisTraceAI project addresses the need to detect ongoing disinformation campaigns by exploring effective methods for detecting the presence of narratives in textual content. The project aims to enable this detection in selected lower-resource languages. DisTraceAI is conceived as an extension of the vera.ai project (Horizon Europe), in which KInIT participates.
The goal of the DisTraceAI project is to research artificial intelligence methods and models for detecting disinformation and manipulative campaigns in online content (from the web and social media). The methods and models will focus on text processing, will be fundamentally multilingual, and will be tailored primarily to the needs of the Central European information space. Since detecting campaigns requires first detecting the narratives and claims present in the content, DisTraceAI dedicates part of its capacity to research on methods and models for narrative and claim detection.
Additionally, we will explore possibilities for the early detection of disinformation and manipulative campaigns, before they fully develop in the online space. Compared to the ongoing Horizon Europe project vera.ai, DisTraceAI will introduce more advanced text processing methods (primarily based on the latest large language models), methods specific to our region and its content domains (along with a new dataset focused on our region), an emphasis on real-time campaign detection, and robustness against new disinformation narratives.
The project is based on the assumption that disinformation and disinformation campaigns have characteristic patterns and connections that can be identified using analytical methods and artificial intelligence.
DisTraceAI uses modern machine learning methods, natural language processing, and data analysis to address the problem of identifying and detecting disinformation in online media. A key factor is the acquisition of high-quality, diverse training data to ensure that the models are effective in the real world. Since we anticipate a shortage of labelled data, we use techniques for learning with limited labels, such as transfer learning, meta-learning, semi-supervised learning, and weak supervision.
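To make the idea of weak labels more concrete, the following is a minimal illustrative sketch in Python and not the project's actual method. It assumes a handful of hypothetical keyword-based labelling functions (the phrases, the function names, and the `NARRATIVE`/`OTHER` label values are all invented for illustration) and combines their votes by simple majority, abstaining when no function fires or when the vote is tied.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical label values, chosen only for this illustration.
NARRATIVE = 1   # text appears to carry the tracked narrative
OTHER = 0       # text appears unrelated to the narrative
ABSTAIN = None  # labelling function has no opinion

LabelFn = Callable[[str], Optional[int]]


# Hypothetical labelling functions: cheap, noisy heuristics, not real detectors.
def lf_secret_plan(text: str) -> Optional[int]:
    return NARRATIVE if "secret plan" in text.lower() else ABSTAIN


def lf_hiding_truth(text: str) -> Optional[int]:
    return NARRATIVE if "hiding the truth" in text.lower() else ABSTAIN


def lf_official_source(text: str) -> Optional[int]:
    return OTHER if "according to the ministry" in text.lower() else ABSTAIN


def weak_label(text: str, label_fns: List[LabelFn]) -> Optional[int]:
    """Combine noisy votes by majority; abstain on no votes or on a tie."""
    votes = [v for v in (fn(text) for fn in label_fns) if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # tied vote: leave the example unlabelled
    return ranked[0][0]


LABEL_FNS: List[LabelFn] = [lf_secret_plan, lf_hiding_truth, lf_official_source]

if __name__ == "__main__":
    samples = [
        "They are hiding the truth about a secret plan.",
        "According to the ministry, the road reopens on Monday.",
        "The weather is nice today.",
    ]
    for sample in samples:
        print(weak_label(sample, LABEL_FNS), "<-", sample)
```

In a setting like the one described above, such weak labels would typically serve only as noisy training signal for a downstream model, and examples on which the functions abstain could be left to semi-supervised learning.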
Funded by the EU NextGenerationEU through the Recovery and Resilience Plan for Slovakia under the project No. 09I01-03-V04-00006.