Research Group

Web & User Data Processing


Ján Čegiň

Research areas: natural language processing, crowdsourcing, human-computer interaction, human-inspired interaction with large language models

Position: PhD Student


Ján focuses on natural language processing, machine learning, crowdsourcing and bridging human-computer interaction with large language models.

He holds a Master's degree in Intelligent Software Systems from the Slovak University of Technology, where he graduated with honors (cum laude).

He has participated in national research projects in co-operation with industry leaders such as ESET (malware detection) and Continental Automotive Systems (test data generation for the MC/DC criterion).

PhD topic: Machine learning with human in the loop

Supervising team: Jakub Šimko (KInIT), Peter Brusilovsky (University of Pittsburgh)

Adversarial attacks exploit weaknesses of machine learning models, causing misclassification in many sensitive domains such as false information detection, hate speech detection, and malware detection. Adversarial training, in which adversarial examples are incorporated into the training process to produce more robust models, has received much attention as a defense against these attacks. One of the key challenges in adversarial training is the acquisition of adversarial examples. These examples are generally created by perturbing an original example until a label switch occurs. However, the examples should be diverse, as a lack of diversity in the collected adversarial examples may reinforce spurious correlations during adversarial training. Although adversarial training can be applied to multiple modalities, this work focuses on natural language processing. Two main methods exist to collect adversarial examples:

(1) automatic methods that use heuristics or machine learning methods to generate adversarial examples and

(2) crowdsourcing methods where human workers are tasked with creating adversarial examples.
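As an illustration of the perturbation idea behind automatic methods of type (1), the following is a minimal sketch that substitutes words in a text one by one until the predicted label switches. Everything in it is hypothetical and chosen only for illustration: the keyword-based `toy_classifier`, the hand-written substitution table, and the function name `perturb_until_flip` are assumptions, not the method used in this research.

```python
from typing import Callable, Dict, List, Optional


def toy_classifier(text: str) -> str:
    """Hypothetical stand-in model: flags text that contains a trigger word."""
    triggers = {"hoax", "fake"}
    tokens = text.lower().split()
    return "false_info" if any(t in triggers for t in tokens) else "ok"


def perturb_until_flip(
    text: str,
    classify: Callable[[str], str],
    substitutions: Dict[str, List[str]],
) -> Optional[str]:
    """Greedily substitute words, left to right, until the label switches.

    Returns the perturbed text, or None if no label switch was reached.
    """
    original_label = classify(text)
    tokens = text.split()
    for i, tok in enumerate(tokens):
        alternatives = substitutions.get(tok.lower())
        if not alternatives:
            continue
        # Keep each substitution and check whether the label has changed.
        tokens[i] = alternatives[0]
        candidate = " ".join(tokens)
        if classify(candidate) != original_label:
            return candidate
    return None


# Assumed, hand-written substitution table (illustrative only).
SUBSTITUTIONS = {"fake": ["fabricated"], "hoax": ["fabrication"]}

adversarial = perturb_until_flip(
    "this vaccine story is a fake hoax", toy_classifier, SUBSTITUTIONS
)
```

A heuristic like this has no notion of meaning, so nothing guarantees that the perturbed text still says the same thing as the original, which is the kind of semantic drift that motivates involving human workers.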

While automatic methods achieve state-of-the-art performance, they are often domain-specific, and the generated examples may have altered semantic meaning, leading to considerable information loss. Crowdsourcing methods are expensive and resource-intensive, but the collected adversarial examples are more diverse and of higher quality. Given this trade-off, we focus on crowdsourcing approaches to collect adversarial examples for domains where the preservation of meaning is crucial, such as false information detection in NLP.

However, in such approaches workers are left guessing whether their contributions are sufficiently diverse, as no direct feedback on diversity is provided. Diversity-based incentives also make it increasingly harder for workers to complete their tasks successfully, leading to an increased abandonment rate. Given these drawbacks, our goal in this thesis proposal is to:

(1) incentivize workers to produce more diverse adversarial text examples by providing explicit explanation and visualization and

(2) increase the number of collected valid adversarial examples by using guidance methods for the workers.
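To make goal (1) more concrete, the sketch below shows one simple way a diversity score could be computed over a set of collected examples and reported back to a worker. The choice of metric is an assumption made for illustration: the text above does not specify how diversity is measured, and mean pairwise Jaccard distance over word sets is only one of many possible options. All function names are hypothetical.

```python
from itertools import combinations
from typing import List, Set


def token_set(text: str) -> Set[str]:
    """Lowercased set of whitespace-separated tokens."""
    return set(text.lower().split())


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def mean_pairwise_diversity(examples: List[str]) -> float:
    """Average Jaccard distance over all pairs of examples.

    0.0 means all examples share the same vocabulary,
    1.0 means no two examples share a single word.
    """
    sets = [token_set(e) for e in examples]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)
```

A score of this kind could, for instance, be recomputed after each submission, so that the worker sees whether a new example adds variety to the existing set or merely repeats it.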

We formulate our research questions based on these goals and present the planned experiments designed to answer them.