RobIndAI: Robust indication of AI-generated disinformation content in multilingual online space

The RobIndAI project will fight the misuse of AI for disinformation text generation by increasing the robustness of methods for machine-generated text detection. The project focuses on multilingual content, especially the languages of the Central European region, targeting news articles and social-media content. RobIndAI is conceived as an extension of the VIGILANT project (Horizon Europe), in which KInIT participates.

The goal of the RobIndAI project is to research artificial intelligence methods and models for increasing the robustness of indicating disinformation content (from the web and social media), with a particular focus on the detection of machine-generated texts. Since modern language models can generate high-quality multilingual text that is indistinguishable from human writing, concerns about the misuse of such technology (e.g., in international disinformation campaigns) are growing. Reliable detection of AI-generated text, differentiating it from authentic human-written content, is therefore a key indicator.

In the RobIndAI project, we will use fundamentally multilingual methods and models for text processing, specifically tailored for the Central European information space. Within the project, we will create a benchmark focused on this region, comparing the performance of existing methods for AI-generated text detection. The benchmark study will also examine the robustness of such methods against existing attacks and obfuscation techniques designed to evade detection. Compared to the ongoing Horizon Europe project VIGILANT, RobIndAI will introduce more advanced text processing methods (primarily based on the latest large language models), regional and content-domain specificity of methods (along with a new dataset focused on our region), a deeper comparison of architectural alternatives for detection (a dedicated model for each language vs. a single multilingual model), and robustness against new, more sophisticated attacks.
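To illustrate the kind of robustness evaluation such a benchmark involves, the sketch below measures how a detector's accuracy degrades under a simple character-level obfuscation attack (replacing Latin letters with visually similar Cyrillic homoglyphs, a known evasion technique). The detector here is a hypothetical stand-in, not a RobIndAI model, and the homoglyph table is deliberately minimal.

```python
# Sketch of a robustness check for a machine-generated-text detector:
# compare accuracy on clean samples vs. samples obfuscated by a
# homoglyph attack. All names here are illustrative placeholders.

HOMOGLYPHS = {"a": "а", "e": "е", "o": "о", "c": "с", "p": "р"}  # Latin -> Cyrillic

def obfuscate(text: str) -> str:
    """Apply a character-level homoglyph attack intended to evade detection."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)

def accuracy(detector, samples) -> float:
    """Fraction of (text, label) pairs the detector classifies correctly."""
    return sum(detector(text) == label for text, label in samples) / len(samples)

def robustness_report(detector, samples) -> dict:
    """Accuracy before and after the attack, plus the resulting drop."""
    clean = accuracy(detector, samples)
    attacked = accuracy(detector, [(obfuscate(t), y) for t, y in samples])
    return {"clean": clean, "attacked": attacked, "drop": clean - attacked}
```

A large `drop` value signals a brittle detector; a robust one should score nearly the same on clean and obfuscated inputs.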

The project is based on the assumption that AI-generated texts exhibit characteristic patterns, which can be identified by analytic methods and by artificial intelligence itself. With regard to disinformation, the project treats machine-generated text as a positive indicator of mass-spread disinformation in the online space.
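As a toy illustration of such a "characteristic pattern", the sketch below flags texts with unusually low lexical diversity. This single feature is far weaker than what real detectors use (typically language-model statistics such as per-token log-likelihood or learned classifiers), and the threshold is arbitrary; it only shows the shape of a pattern-based indicator.

```python
# Illustrative only: one hand-crafted stylometric feature as a stand-in
# for the learned patterns a real detector would rely on.

def type_token_ratio(text: str) -> float:
    """Lexical diversity: unique tokens divided by total tokens."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def flag_machine_generated(text: str, threshold: float = 0.5) -> bool:
    """Flag text whose lexical diversity falls below an (arbitrary) threshold."""
    return type_token_ratio(text) < threshold
```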

RobIndAI uses modern machine learning, natural language processing, and data analysis methods to address the problem of detecting machine-generated text in online media. A key factor is the acquisition of high-quality training data and a diverse dataset (augmented with paraphrased texts) to ensure the models' effectiveness in the real world.
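The augmentation step described above can be sketched as follows. The `paraphrase` function here is a hypothetical placeholder (in practice it would call a paraphrasing model); the point is that each labeled example gains paraphrased variants while keeping its original label.

```python
# Sketch of dataset augmentation with paraphrased texts, assuming a
# dataset of (text, label) pairs. `paraphrase` is a crude placeholder.
import random

def paraphrase(text: str, rng: random.Random) -> str:
    """Placeholder paraphraser: shuffles sentence order as a rough proxy."""
    sentences = [s for s in text.split(". ") if s]
    rng.shuffle(sentences)
    return ". ".join(sentences)

def augment(dataset, n_variants: int = 1, seed: int = 0):
    """Return the dataset plus paraphrased copies, labels preserved."""
    rng = random.Random(seed)
    out = list(dataset)
    for text, label in dataset:
        out.extend((paraphrase(text, rng), label) for _ in range(n_variants))
    return out
```

Training on such augmented data is one common way to make a detector less sensitive to paraphrasing attacks, since it sees multiple surface forms of the same content.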

Project team

Jakub Šimko
Lead and Researcher
Dominik Macko
Researcher
Jakub Kopál
Research Engineer
Michal Spiegel
Research Intern
Adam Škurla
Research Intern
Katarína Házyová
Project Administrator
Marianna Palková
Communications Specialist
Adrián Gavorník
Ethics Specialist
Samuel Budai
Research Intern

Funded by the EU NextGenerationEU through the Recovery and Resilience Plan for Slovakia under the project No. 09I01-03-V04-00059.

Related Publications

  • Macko, D., Moro, R., & Srba, I. (2025). Increasing the Robustness of the Fine-tuned Multilingual Machine-Generated Text Detectors. arXiv preprint arXiv:2503.15128.
  • Macko, D., Ramakrishnan, A. A., Lucas, J. S., Moro, R., Srba, I., Uchendu, A., & Lee, D. (2025). Beyond speculation: Measuring the growing presence of LLM-generated texts in multilingual disinformation. arXiv preprint arXiv:2503.23242.