Research Group

Web & User Data Processing


Branislav Pecher

Research areas: machine learning, deep learning, interpretability and explainability, few-shot learning, meta-learning, misinformation detection, learning with limited labelled data

Position: Researcher


Branislav is a PhD student focusing on misinformation detection and related phenomena, as well as on learning models that involve only a limited number of annotated samples, particularly meta-learning. He is also interested in the interpretability and explainability of various machine learning models, especially neural networks.

He holds a Master’s degree in Intelligent Software Systems from the Slovak University of Technology. During his studies, he received multiple Rector’s awards for excellent study performance and graduated magna cum laude. He has participated in both national and international research projects and is a former member of the PeWe (Personalized Web) research group.

PhD topic: Addressing effects of randomness on the stability of learning with limited labelled data

Supervising team: Mária Bieliková (KInIT), Ivan Srba (KInIT)

Learning with limited labelled data, including but not limited to concepts such as zero/few-shot learning, transfer learning or meta-learning, aims to effectively train a model while using only a small number of labelled samples. Although such approaches have been successfully applied in many domains, they are especially sensitive to the effects of uncontrolled randomness caused by non-determinism in the training process. This non-determinism negatively affects the stability of the trained models and leads to large variance in results across training runs. Such uncontrolled randomness can unintentionally, but unfortunately also intentionally, create an illusory impression of research progress. When disregarded, the randomness can prevent objective comparison of new methods, leading to overestimation, or in some cases underestimation, of their performance. Even though a growing number of studies focus on addressing the negative effects of randomness on stability, the research is still limited in its extent and faces many problems and inconsistencies.

Our aim in this work is to address the shortcomings of current research on the effects of randomness in learning with limited labelled data. Based on a comprehensive survey, we identify the most significant open problems: investigations that cover only a small fraction of the sources of randomness, disregard for the interactions between sources of randomness, and an absence of effective mitigation strategies. As part of preliminary experiments, we design a general methodology for investigating the effects of randomness. We apply this methodology to determine the importance of multiple sources of randomness in meta-learning and language model fine-tuning. The preliminary results reveal a significant impact of randomness originating from how the data is split and which samples are selected for labelling, both of which are mostly ignored in current research.
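The idea of investigating one source of randomness at a time can be illustrated with a small sketch. This is not the methodology from the thesis or the papers listed below. It is a toy example under stated assumptions: the factor names, the noise magnitudes, and the `train_and_evaluate` function are all hypothetical, and the "training run" is simulated by adding seed-dependent noise to a base score. The sketch varies the seed of one factor while the seeds of the other factors stay fixed, repeats this for several fixed configurations, and reports the average standard deviation per factor.

```python
import random
import statistics

# Hypothetical sources of randomness (names chosen for illustration only).
FACTORS = ("data_split", "label_selection", "model_init")


def train_and_evaluate(seeds):
    """Toy stand-in for a real training run; returns a simulated accuracy.

    Each factor adds seed-dependent noise of a made-up magnitude.
    """
    noise_scale = {"data_split": 0.05, "label_selection": 0.04, "model_init": 0.01}
    score = 0.80
    for factor in FACTORS:
        # Seeding with a string keeps the noise reproducible per (factor, seed).
        rng = random.Random(f"{factor}-{seeds[factor]}")
        score += rng.uniform(-noise_scale[factor], noise_scale[factor])
    return score


def factor_std(factor, n_investigation=20, n_fixed_configs=5):
    """Standard deviation of the score when only `factor` varies.

    The result is averaged over several fixed configurations of the
    remaining factors, so it does not hinge on one arbitrary setting.
    """
    stds = []
    for fixed_seed in range(n_fixed_configs):
        seeds = {name: fixed_seed for name in FACTORS}
        scores = []
        for seed in range(n_investigation):
            seeds[factor] = seed  # vary only the investigated factor
            scores.append(train_and_evaluate(seeds))
        stds.append(statistics.stdev(scores))
    return statistics.mean(stds)


results = {factor: factor_std(factor) for factor in FACTORS}
for factor, std in results.items():
    print(f"{factor}: std = {std:.4f}")
```

Because the simulated score is purely additive, the choice of fixed configuration barely changes the measured deviation here. In real training runs the sources of randomness can interact, which is why averaging over several fixed configurations is included in the sketch.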

Selected achievements

Placed first in 6 out of 9 languages within SemEval Task 3 Subtask data challenge focused on detection of persuasion techniques (team name KInIT). Placed 2nd, 3rd and 4th in the remaining languages, becoming the most successful team. The task was a complex multi-class, multi-label classification problem with 23 classes and 3 surprise languages (evaluated in a zero-shot setting).

Best Paper Award at the RecSys 2021 conference (full paper)

Rector’s Award for Excellent Study Results during Master’s Programme, 2019, Slovak University of Technology

Rector’s Award for the Best Student of Master Study for the Academic Years 2017/18, 2018/19, Slovak University of Technology

Selected activities

Attended the IJCAI-ECAI’22 conference and took part in the Doctoral Consortium with the paper “Transferability and Stability of Learning with Limited Labelled Data in Multilingual Text Document Classification”

Participated in the Eastern European Machine Learning Summer School 2022

Attended various seminars and lecture series, including “Introduction to Stochastic Gradient Descent Methods” by Peter Richtarik

Organised multiple seminars on Learning with Limited Labelled Data methods for the members of KInIT, serving as an overview of possible approaches

Attended the RecSys’21 conference and presented the paper “An Audit of Misinformation Filter Bubbles on YouTube: Bubble Bursting and Recent Behavior Changes”, which won the Best Paper Award

Volunteered at the IJCAI-ECAI’22 and RecSys’21 conferences

Attended the ECML-PKDD’20 conference and presented the paper “FireAnt: claim-based medical misinformation detection and monitoring”

Participated in the PhD Forum at the ECML-PKDD’20 conference with the unpublished paper “Learning to Detect Misinformation Using Meta-Learning”

Selected Projects

CEDMO: Central European Digital Media Observatory

Dec 7, 2021

As EDMO sets the frame for Europe, the aim of CEDMO is to implement an unprecedented, but highly experienced hub against disinformation for the Czech Republic, Poland and Slovakia. Its…

vera.ai: VERification Assisted by Artificial Intelligence

Jul 11, 2022

vera.ai is a Horizon Europe project aimed at basic and applied research of AI methods to fight false information. The vera.ai project focuses on textual, multilingual and multimodal content, and…

Selected Publications

A Survey on Stability of Learning with Limited Labelled Data and its Sensitivity to the Effects of Randomness

Pecher, B., Srba, I., Bielikova, M. – ACM Computing Surveys (ACM CSUR), August 2024

Learning with limited labelled data, such as prompting, in-context learning, fine-tuning, meta-learning or few-shot learning, aims to effectively train a model using only a…


On Sensitivity of Learning with Limited Labelled Data to the Effects of Randomness: Impact of Interactions and Systematic Choices.

Pecher, B., Srba, I., Bielikova, M. – Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Automated disinformation generation is often listed as one of the risks of large language models (LLMs). The theoretical ability to flood the information space with disinformation content might have dramatic…


Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation

Pecher, B., Cegin, J., Belanec, R., Simko, J., Srba, I., Bielikova, M. – Findings of the Association for Computational Linguistics: EMNLP 2024

While fine-tuning of pre-trained language models generally helps to overcome the lack of labelled training samples, it also displays model performance instability. This instability mainly originates from randomness in initialisation…


Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentation

Cegin, J., Pecher, B., Simko, J., Srba, I., Bielikova, M., Brusilovsky, P. – Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) – ACL 2024

The latest generative large language models (LLMs) have found their application…


Towards Continuous Automatic Audits of Social Media Adaptive Behavior and its Role in Misinformation Spreading

Simko, J., Tomlein, M., Pecher, B., Moro, R., Srba, I., Stefancova, E., Hrckova, A., Kompan, M., Podrouzek, J., Bielikova, M. – Adjunct Proceedings of the 29th ACM Conference on User Modeling, Adaptation and Personalization – UMAP 2021


Auditing YouTube’s Recommendation Algorithm for Misinformation Filter Bubbles

Srba, I., Moro, R., Tomlein, M., Pecher, B., Simko, J., Stefancova, E., Kompan, M., Hrckova, A., Podrouzek, J., Gavornik, A., Bielikova, M. – ACM Transactions on Recommender Systems (ACM TORS), 2023


An Audit of Misinformation Filter Bubbles on YouTube: Bubble Bursting and Recent Behavior Changes

Tomlein, M., Pecher, B., Simko, J., Srba, I., Moro, R., Stefancova, E., Kompan, M., Hrckova, A., Podrouzek, J., Bielikova, M. – RecSys ’21: Fifteenth ACM Conference on Recommender Systems, 28 June 2021


FireAnt: Claim-based Medical Misinformation Detection and Monitoring

Pecher, B., Srba, I., Moro, R., Tomlein, M., Bielikova, M. – European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Springer, 2020


Transferability and Stability of Learning with Limited Labelled Data in Multilingual Text Document Classification

Pecher, B. – Thirty-First International Joint Conference on Artificial Intelligence Doctoral Consortium

