KInIT and The University of Sheffield Proposed Winning Solutions at the Prestigious SemEval 2023 Data Challenge

As part of the project, the Kempelen Institute of Intelligent Technologies (KInIT) and the University of Sheffield jointly participated in the well-recognized SemEval 2023 data challenge, the most prestigious data challenge in the domain of Natural Language Processing (NLP).

The challenge consisted of 12 different tasks across a wide range of application domains and problems. Within this competition, we focused specifically on the task of predicting the credibility of text in articles found on the Web (Task no. 3).

This task is closely related to the research within the project, specifically to research on predicting the credibility of online content. KInIT focused primarily on the persuasion technique detection subtask. The University of Sheffield focused primarily on the two remaining subtasks:

  • news genre categorization (whether the provided news article is an opinion piece, satire or objective reporting);
  • framing detection (what topic the article addressed, such as health and safety, economics, etc.).

In the persuasion technique detection subtask, the goal is to automatically identify all the persuasion techniques in a written paragraph of text. Persuasion techniques are used to influence someone’s beliefs, attitudes, or behaviors in order to convince them to adopt a particular point of view or take a specific action. As such, they are often used in marketing, politics and other interpersonal communication.

Although persuasion techniques can be used in a positive context, their most significant negative impact comes from their use in misinformation campaigns. In the misinformation context, persuasion techniques are often used to influence public opinion about a specific topic and to persuade people to adopt a specific, often harmful, view, for example by magnifying the negative impact of vaccination or justifying hostile behavior.

Therefore, automatically detecting persuasion techniques represents an important application of AI when addressing misinformation. It can be used as a credibility signal to assess content credibility and improve misinformation detection. This task is challenging for several reasons.

First of all, the given paragraphs can contain multiple persuasion techniques at the same time, or even none at all (which is often the case). This also results in significant imbalance in the data, where some persuasion techniques appear only rarely. 
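The multi-label nature of the problem and the resulting imbalance can be illustrated with a minimal sketch. The technique names and toy annotations below are hypothetical (they are not the SemEval data), and weighting each technique by the ratio of negatives to positives is only one common way to counter imbalance, not necessarily the one used by either team.

```python
from collections import Counter

# Hypothetical label vocabulary and toy annotations (not the SemEval data).
TECHNIQUES = ["Loaded_Language", "Doubt", "Appeal_to_Fear"]
paragraph_labels = [
    ["Loaded_Language", "Doubt"],  # several techniques in one paragraph
    [],                            # no technique at all
    ["Loaded_Language"],
    [],
    ["Appeal_to_Fear"],
    ["Loaded_Language"],
]


def to_multi_hot(labels, vocabulary):
    """Encode a list of technique names as a 0/1 vector over the vocabulary."""
    present = set(labels)
    return [1 if name in present else 0 for name in vocabulary]


def positive_weights(all_labels, vocabulary):
    """Weight each technique by (#negatives / #positives): rarer means heavier."""
    counts = Counter(name for labels in all_labels for name in set(labels))
    total = len(all_labels)
    return {
        name: (total - counts[name]) / counts[name] if counts[name] else 0.0
        for name in vocabulary
    }


vectors = [to_multi_hot(labels, TECHNIQUES) for labels in paragraph_labels]
weights = positive_weights(paragraph_labels, TECHNIQUES)
```

In this toy data the rare techniques receive a weight five times larger than the frequent one, which is the kind of correction a training loss could apply.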

Secondly, manual identification of persuasion techniques is difficult and time-consuming. It requires experts who understand all the nuances of discourse and discussion. Even then, it can take a single expert hours to identify all the persuasion techniques in a given news article.

Finally, the task was formulated as a multilingual problem, with annotated texts in 6 languages (English, French, German, Italian, Polish and Russian) and 3 surprise languages (Greek, Georgian and Spanish) that were revealed only for testing purposes. The proposed AI solution therefore needed to handle all languages at the same time and also generalize to languages for which it had never seen any data.

The KInIT team, consisting of interns Timo Hromádka, Timotej Smoleň and Tomás Remiš under the supervision of Branislav Pecher and Ivan Srba, proposed a solution that achieved great results.

Using the proposed AI solution, we ranked 1st in 6 out of 9 languages (Italian, Russian, German, Polish, Greek and Georgian), two of which (Greek and Georgian) were previously unseen languages.

For the remaining languages, we placed 2nd in Spanish (the last unseen language), 3rd in French and 4th in English. These results were complemented by the team from the University of Sheffield, which achieved 2nd place for English.

As part of the solution, the KInIT team tried different methods for dealing with multilinguality. The first approach explored was a simple monolingual model that worked only in English, with all the other languages translated into English using a translation service (Google Translate). The second approach explored was a fully multilingual model that can understand and work with all the languages.

In both of these approaches, the KInIT team used pretrained language models (RoBERTa for the monolingual approach, XLM-RoBERTa for the multilingual one) that were fine-tuned for the specific task of persuasion technique detection. In the comparison, the multilingual solution proved to be superior and was therefore used for the final solution (for more information, see the research paper to be published at the SemEval workshop).
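The difference between the two approaches can be sketched as two small prediction pipelines. This is an illustration only: the function names, the `fake_translate` and `fake_model` stand-ins, and the technique name are hypothetical. A real setup would plug in a translation service and fine-tuned RoBERTa or XLM-RoBERTa classifiers instead.

```python
from typing import Callable, List

# Hypothetical interfaces: a translator maps (text, source language) to
# English text; a classifier maps text to a list of technique names.
Translator = Callable[[str, str], str]
Classifier = Callable[[str], List[str]]


def predict_via_translation(
    text: str, lang: str, translate: Translator, english_model: Classifier
) -> List[str]:
    """Approach 1: translate non-English input, then use an English-only model."""
    english_text = text if lang == "en" else translate(text, lang)
    return english_model(english_text)


def predict_multilingual(text: str, multilingual_model: Classifier) -> List[str]:
    """Approach 2: pass the original text directly to a multilingual model."""
    return multilingual_model(text)


# Toy stand-ins so the sketch runs without any external service or model.
def fake_translate(text: str, lang: str) -> str:
    return f"[{lang}->en] {text}"


def fake_model(text: str) -> List[str]:
    return ["Loaded_Language"] if "!" in text else []


via_translation = predict_via_translation("Skandal!", "de", fake_translate, fake_model)
direct = predict_multilingual("Skandal!", fake_model)
```

The first pipeline adds a translation step whose errors can propagate to the classifier, while the second relies on a single model to handle every language, including ones it was not fine-tuned on.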

The University of Sheffield team was one of 5 teams that attempted all three subtasks for all languages within the shared task, focusing particularly on the two remaining subtasks: detecting the news genre and framing. For Subtask 1 (Genre), the Sheffield team achieved joint first place for German and had the best mean rank among multi-language teams.

For Subtask 2 (Framing), the team achieved first place in 3 languages and the best average rank across all the languages.

For Subtask 3 (Persuasion Techniques), the team achieved a top-10 placement for all languages, including 2nd for English. The team's research paper was published in the SemEval workshop proceedings and nominated for the Best Paper Award.
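The mean (average) rank mentioned for Subtasks 1 and 2 can be illustrated with a short sketch. The team names and ranks below are made up, and the sketch assumes that only teams with a rank in every listed language are compared. The exact eligibility rule used in the shared task is not stated here.

```python
from statistics import mean

# Made-up per-language ranks for three hypothetical teams.
ranks = {
    "team_a": {"en": 2, "de": 1, "fr": 3},
    "team_b": {"en": 1, "de": 4, "fr": 4},
    "team_c": {"en": 3, "de": 2},  # did not submit for French
}


def mean_rank(team_ranks):
    """Average a team's rank over the languages it was ranked in."""
    return mean(team_ranks.values())


def best_by_mean_rank(all_ranks, languages):
    """Return the team with the lowest mean rank among teams covering all languages."""
    required = set(languages)
    eligible = {team: r for team, r in all_ranks.items() if required <= set(r)}
    return min(eligible, key=lambda team: mean_rank(eligible[team]))


winner = best_by_mean_rank(ranks, ["en", "de", "fr"])
```

Here `team_b` wins English outright, yet `team_a` has the better mean rank, which shows how a team can lead on average without placing first everywhere.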

For Subtask 1, the team used an ensemble of fully trained and adapter-based mBERT models. For Subtask 2, two separate ensembles were applied: a monolingual RoBERTa-MUPPET LARGE ensemble and an ensemble of XLM-RoBERTa LARGE models with adapters and task-adaptive pretraining. For Subtask 3, the team trained a monolingual RoBERTa-Base model for English and a multilingual mBERT model for the remaining languages. For each subtask, the team compared monolingual and multilingual approaches and considered techniques for handling class imbalance.
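As an illustration of the ensemble idea, the sketch below averages the class probabilities of several models and picks the most likely genre (often called soft voting). How the Sheffield ensembles actually combined their models is not described here, so the averaging rule, the model names and the probabilities are all assumptions.

```python
GENRES = ["opinion", "satire", "reporting"]

# Hypothetical class probabilities from three models for one article.
model_probs = [
    [0.5, 0.25, 0.25],   # e.g. a fully trained model
    [0.25, 0.25, 0.5],   # e.g. an adapter-based model
    [0.75, 0.25, 0.0],   # e.g. another ensemble member
]


def average_ensemble(probabilities):
    """Average per-class probabilities across models (soft voting)."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    return [
        sum(model[c] for model in probabilities) / n_models
        for c in range(n_classes)
    ]


averaged = average_ensemble(model_probs)
# Index of the class with the highest averaged probability.
best_index = max(range(len(averaged)), key=averaged.__getitem__)
predicted_genre = GENRES[best_index]
```

Averaging lets one model's confident prediction be tempered by the others, which tends to make the combined output more stable than any single member.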

Image: the results achieved by the Kempelen Institute of Intelligent Technologies (team name: KInITVeraAI) and the University of Sheffield (team name: SheffieldVeraAI).

All in all, both teams were very successful in this complex task of credibility detection, which is highly relevant for the project. Motivated by these positive results, we are continuing with further experiments, including experiments with state-of-the-art large language models such as ChatGPT and GPT-4.

We expect that the new findings will lead to additional interesting research outcomes as well as to the deployment of the best-performing models as part of a credibility assessment service, which we plan to deliver in the project. Such a service will help media professionals obtain credibility analyses of various online content automatically and effectively (for example, to pre-screen whether the content is not credible and should be fact-checked or, on the contrary, is credible and can be cited).

Olesya Razuvayevskaya from The University of Sheffield co-authored this article.