On the Effects of Randomness on Stability of Learning with Limited Labelled Data

Have you ever tried to replicate the results of a machine learning study, only to find performance numbers and findings that differ from those reported in the original work? Or have you tried to determine which model can be considered state-of-the-art for a specific task, only to find that studies report contradictory findings in this regard? Maybe you tried the newest method that promised significantly better performance, only to find that it actually underperforms a simple baseline?

We recently published a survey paper as a preprint that aims to inform researchers and practitioners who use learning with limited labelled data about the consequences of unaddressed randomness and how to effectively prevent and deal with them.

A common culprit that significantly contributes to all of these problems is uncontrolled randomness in the training process. Approaches for dealing with limited labelled data in particular, such as in-context learning, transfer learning or meta-learning (but, to a certain extent, also neural networks in general), have been identified as sensitive to the effects of uncontrolled randomness.

Take in-context learning, for example, where something as simple as changing the order in which the in-context samples are presented to the model can determine whether we get state-of-the-art predictions or random guessing. Similarly, in the setting of limited data, repeating fine-tuning with different initialisations can lead to large deviations in performance, where in some cases the smallest BERT variants can outperform their larger counterparts.
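To see what such run-to-run variance looks like in practice, here is a minimal sketch that uses a small scikit-learn classifier as a stand-in for fine-tuning a large model (the dataset, model, and seed range are our own illustrative choices, not taken from the survey): we repeat training on a small labelled subset, varying only the random seed, and report the spread of test accuracy.

```python
# Measure run-to-run instability under limited labelled data: repeat
# training with different seeds and report the spread of test accuracy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=50, random_state=0)  # only 50 labelled samples

scores = []
for seed in range(10):  # 10 runs that differ only in the random seed
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                        random_state=seed)
    clf.fit(X_train, y_train)
    scores.append(clf.score(X_test, y_test))

print(f"accuracy: mean={np.mean(scores):.3f}, std={np.std(scores):.3f}, "
      f"min={np.min(scores):.3f}, max={np.max(scores):.3f}")
```

With so few labelled samples, the gap between the best and worst seed can easily dwarf the improvements typically reported between competing methods.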

This uncontrolled randomness, if not properly addressed, has been found to lead to negative consequences, such as:

  • prohibiting objective comparisons between different models
  • creating an illusory perception of research progress (due to unintentional cherry-picking)
  • making the research irreproducible

However, even though the effects of randomness can have a significant impact, the attention devoted to addressing them remains limited, especially when dealing with a limited number of labels.

In our new paper, we provide a comprehensive survey of papers that address the effects of randomness. First, we provide an overview of the possible sources of randomness in training (i.e., randomness factors), such as initialisation, data choice or data order, that may lower the stability of the learned models.
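To make the distinction between these factors concrete, here is a minimal sketch, assuming a PyTorch setup, of how each of the three named factors can be fixed (or varied) with its own seed; the helper name and seed-splitting scheme are our own illustrative choices, not an API from the survey.

```python
# Fix each randomness factor with its own seed so that their effects
# can be studied in isolation.
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

def build_run(init_seed, data_choice_seed, data_order_seed, X, y,
              train_size=50):
    # Factor 1 -- initialisation: the model's starting weights.
    torch.manual_seed(init_seed)
    model = torch.nn.Linear(X.shape[1], 2)

    # Factor 2 -- data choice: which samples end up in the labelled set.
    rng = np.random.default_rng(data_choice_seed)
    idx = torch.from_numpy(rng.choice(len(X), size=train_size,
                                      replace=False))
    train_set = TensorDataset(X[idx], y[idx])

    # Factor 3 -- data order: how samples are shuffled into batches.
    order_gen = torch.Generator().manual_seed(data_order_seed)
    loader = DataLoader(train_set, batch_size=8, shuffle=True,
                        generator=order_gen)
    return model, loader

# Varying one seed while fixing the others isolates that factor's impact:
X, y = torch.randn(1000, 20), torch.randint(0, 2, (1000,))
for init_seed in range(5):
    model, loader = build_run(init_seed, data_choice_seed=0,
                              data_order_seed=0, X=X, y=y)
```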

Second, we focus on the tasks for addressing the effects of randomness:

  • investigating the impact of the individual randomness factors across different learning approaches; 
  • determining the underlying origin of the randomness effects, such as the problem of underspecification; 
  • and finally mitigating the effects, reducing their impact so as to increase stability without reducing the overall performance of the models (a minimal sketch of one such mitigation follows this list). 
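As a hedged illustration of mitigation, here is a minimal sketch of one common strategy: a seed ensemble, which averages the predictions of models trained with different seeds and typically reduces run-to-run variance without reducing mean performance. It reuses the scikit-learn setup from the earlier sketch; all names and numbers are our own illustrative choices, not prescriptions from the survey.

```python
# Seed-ensemble mitigation: average the predicted probabilities of
# models that differ only in their random seed.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=50, random_state=0)

# Train several models that differ only in their random seed ...
probas = []
for seed in range(10):
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                        random_state=seed).fit(X_train, y_train)
    probas.append(clf.predict_proba(X_test))

# ... and average their predicted probabilities before taking the argmax.
ensemble_pred = np.mean(probas, axis=0).argmax(axis=1)
print("ensemble accuracy:", (ensemble_pred == y_test).mean())
```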

Finally, we aggregate the findings from our analysis of the surveyed papers, based on which we identify 7 open problems that point to future directions in this field.

The purpose of this survey is to emphasise the importance of this research area, which has so far not received adequate attention. First, it should support researchers in this field in their own work. At the same time, it aims to inform researchers and practitioners who use learning with limited labelled data about the consequences of unaddressed randomness and how to effectively prevent and deal with them. The survey paper, which we plan to continuously update along with its supplementary material, is available as a preprint here.