How to find parameters for supervised learning?
In the week of 20 to 24 June 2022, AI enthusiasts in Bratislava had two opportunities to meet Professor Peter Richtárik, a renowned expert in machine learning: at the Better_AI Meetup and at a week-long course at the Faculty of Mathematics, Physics and Informatics (FMFI UK).
Professor Peter Richtárik from KAUST is one of our external mentors. He organized a special course at his alma mater, FMFI in Bratislava, called “Introduction to Stochastic Gradient Descent Methods”, and some of our colleagues had the opportunity to dive deep into the subject.
One of the subcategories of machine learning is supervised machine learning, where the parameters of a model are learned from datasets that contain a label for each observation. Learning proceeds by gradually changing the model’s parameters so that the difference between the predicted and the ground-truth outcome becomes as small as possible. Modern supervised machine learning models, however, have millions of parameters, so finding good parameters as quickly and as precisely as possible is both important and very difficult.
Stochastic Gradient Descent and its variants are nowadays the workhorse methods for training supervised machine learning models. However, the area has grown considerably over the last ten years and is still expanding, which makes it hard for practitioners to understand its landscape.
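To make the idea of gradual parameter updates concrete, here is a minimal sketch of mini-batch stochastic gradient descent on a toy problem. It is not taken from the course material: the function name, the learning rate, the batch size and the noise-free dataset generated from y = 3x + 1 are all assumptions chosen for illustration.

```python
import random


def sgd_linear_regression(xs, ys, lr=0.05, epochs=300, batch_size=4, seed=0):
    """Fit y ~ w * x + b by mini-batch SGD on the mean squared error.

    Illustrative helper (hypothetical name and defaults), not a library API.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0  # model parameters, started at zero
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the observations in a new random order
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = 0.0
            grad_b = 0.0
            for i in batch:
                # difference between prediction and ground-truth label
                error = (w * xs[i] + b) - ys[i]
                grad_w += 2.0 * error * xs[i]
                grad_b += 2.0 * error
            # step against the gradient averaged over the mini-batch
            w -= lr * grad_w / len(batch)
            b -= lr * grad_b / len(batch)
    return w, b


# Toy labelled dataset (assumed for illustration): y = 3x + 1, no noise
xs = [i / 10 for i in range(20)]
ys = [3.0 * x + 1.0 for x in xs]
w, b = sgd_linear_regression(xs, ys)
print(w, b)
```

Each update uses only a small random subset of the data, which is what makes the method "stochastic". On this noise-free toy problem the estimates should end up close to the generating values 3 and 1; with real, noisy data and millions of parameters, the choice of step size and variant matters far more.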
This course was a mathematical introduction to the latest results and insights in this field. The goal was to understand the differences between the key variants of Stochastic Gradient Descent, along with their convergence and complexity.
The theory uncovered interesting relationships between methods and offered explanations of why, in practice, some methods work better than others on a specific problem. The mathematics behind optimization was also insightful for people who work primarily in other fields of AI, for example meta-learning, where optimization is a fundamental part of one of the research directions.
As Professor Peter Richtárik is one of the best researchers in this field, it was a unique opportunity for us to take part. We are very happy that we had the chance to learn something new, to discuss, and to get inspired, not just in the area of stochastic gradient descent methods but also by current research in machine learning and artificial intelligence in general.