Research areas: machine learning, deep learning, interpretability and explainability, few-shot learning, meta-learning, misinformation detection, learning with limited labelled data
Position: PhD Student
Branislav is a PhD student focusing on misinformation detection and related phenomena, as well as on learning with only a limited number of annotated samples, particularly meta-learning. He is also interested in the interpretability and explainability of machine learning models, especially neural networks.
He holds a Master’s degree in Intelligent Software Systems from the Slovak University of Technology. During his studies, he received multiple Rector’s awards for excellent study performance and graduated magna cum laude. He has participated in both national and international research projects and is a former member of the PeWe (Personalized Web) research group.
PhD topic: Addressing effects of randomness on the stability of learning with limited labelled data
Supervising team: Mária Bieliková (KInIT), Ivan Srba (KInIT)
Learning with limited labelled data, which includes concepts such as zero-/few-shot learning, transfer learning, and meta-learning, aims to train a model effectively using only a small number of labelled samples. Although such approaches have been successfully applied in many domains, they are especially sensitive to the effects of uncontrolled randomness caused by non-determinism in the training process. This non-determinism negatively affects the stability of the trained models and leads to large variance in results across training runs. Such uncontrolled randomness can unintentionally, but unfortunately also intentionally, create an illusory perception of research progress. When disregarded, the randomness prevents objective comparison of new methods, leading to overestimation, or in some cases underestimation, of their performance. Even though a growing number of studies focus on addressing the negative effects of randomness on stability, the research is still limited in its extent and faces many problems and inconsistencies.
Our aim in this work is to address these drawbacks in handling the effects of randomness in learning with limited labelled data. Based on a comprehensive survey, we identify the most significant open problems: the focus on only a small fraction of the sources of randomness, the disregard for interactions between these sources, and the absence of effective mitigation strategies. As part of preliminary experiments, we design a general methodology for investigating the effects of randomness. We apply this methodology to determine the importance of multiple sources of randomness in meta-learning and language model fine-tuning. The preliminary results revealed a significant impact of the randomness originating from data splitting and from the selection of which samples are labelled, sources that are mostly ignored in current research.
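The core idea of such an investigation, varying one source of randomness at a time while fixing the others and measuring the resulting variance, can be sketched as follows. This is a minimal illustration, not the actual methodology's code: `train_and_evaluate` is a hypothetical stand-in for a real training run, and the effect magnitudes are illustrative assumptions rather than measured results.

```python
import random
import statistics

def train_and_evaluate(split_seed, label_seed, init_seed):
    """Hypothetical stand-in for one training run with limited labelled data.

    In practice this would split the data, select the labelled samples,
    initialise and train a model, and return its test score. Here it just
    returns a deterministic "accuracy" driven by the controlled seeds.
    """
    base = 0.80
    # Illustrative assumption: data splitting and label selection contribute
    # more variance than model weight initialisation.
    split_effect = random.Random(split_seed).uniform(-0.05, 0.05)
    label_effect = random.Random(label_seed).uniform(-0.04, 0.04)
    init_effect = random.Random(init_seed).uniform(-0.01, 0.01)
    return base + split_effect + label_effect + init_effect

def importance_of(source, n_runs=20, fixed_seed=0):
    """Vary only one source of randomness while fixing the others.

    The standard deviation of the scores across runs estimates that
    source's contribution to the overall instability.
    """
    scores = []
    for run in range(n_runs):
        seeds = {"split": fixed_seed, "label": fixed_seed, "init": fixed_seed}
        seeds[source] = run + 1  # vary only the investigated source
        scores.append(train_and_evaluate(seeds["split"], seeds["label"], seeds["init"]))
    return statistics.stdev(scores)

for source in ("split", "label", "init"):
    print(f"{source}: std across runs = {importance_of(source):.4f}")
```

With real experiments, the same loop structure makes the contribution of each source directly comparable, which is what reveals that often-ignored sources such as data splitting can dominate the variance.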
- Placed first in 6 out of 9 languages in the SemEval Task 3 subtask data challenge focused on the detection of persuasion techniques (team name: KInIT), and placed 2nd, 3rd, and 4th in the remaining languages, becoming the most successful team. The addressed task was a complex multi-class, multi-label classification with 23 classes and 3 surprise languages (in a zero-shot setting).
- Best Paper Award at the RecSys 2021 conference (the awarded full paper)
- Rector’s Award for Excellent Study Results during the Master’s Programme, 2019, Slovak University of Technology
- Rector’s Award for the Best Student of the Master’s Programme for the Academic Years 2017/18 and 2018/19, Slovak University of Technology
- Attended the IJCAI-ECAI’22 conference and took part in the Doctoral Consortium with the paper “Transferability and Stability of Learning with Limited Labelled Data in Multilingual Text Document Classification”
- Participated in the Eastern European Machine Learning Summer School 2022
- Attended various seminars and lecture series (e.g., “Introduction to Stochastic Gradient Descent Methods” by Peter Richtárik)
- Organised multiple seminars on learning with limited labelled data methods for the members of KInIT, serving as an overview of possible approaches
- Attended the RecSys’21 conference and presented the paper “An Audit of Misinformation Filter Bubbles on YouTube: Bubble Bursting and Recent Behavior Changes”, which won the Best Paper Award
- Volunteered at the IJCAI-ECAI’22 and RecSys’21 conferences
- Attended the ECML-PKDD’20 conference and presented the paper “FireAnt: claim-based medical misinformation detection and monitoring”
- Participated in the PhD Forum at the ECML-PKDD’20 conference with the unpublished paper “Learning to Detect Misinformation Using Meta-Learning”