ESET: Machine learning for malware clustering (industry PhD)
With our industry partner, ESET, we created an industrial PhD student position at KInIT to help push the state of the art in the malware domain even further, specifically in malware clustering.
Besides malware detection, ESET also clusters binary files to get accurate segment information about the hundreds of thousands of new files created every day. This is not a trivial task: the differences between similar and dissimilar software are often very small and can change over time.
Our main goal is to contribute to malware clustering research by improving clustering on large numbers of samples, mainly in terms of clustering quality. We focus on creating malware representations designed specifically for clustering, which may improve the precision of clustering methods.
We focus on three main areas of research: improving malware clustering using an end-to-end clustering framework; using neural networks or other intelligent approaches to create the best representations for clustering; and improving how clustering quality is assessed, which currently relies mainly on feedback from domain experts.
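To make the third area more concrete, here is a minimal sketch of one common way to score a clustering against labels supplied by domain experts: cluster purity. The project description does not say which quality measures are used, so the choice of purity, the function name `cluster_purity`, and the placeholder labels (`family_a`, `family_b`, `clean`) are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, expert_labels):
    """Return the fraction of samples that carry the majority label of their cluster.

    Hypothetical helper for illustration; not taken from the project itself.
    """
    if len(cluster_ids) != len(expert_labels):
        raise ValueError("cluster_ids and expert_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one sample is required")

    # Count how often each expert label occurs inside each cluster.
    label_counts = defaultdict(Counter)
    for cluster_id, label in zip(cluster_ids, expert_labels):
        label_counts[cluster_id][label] += 1

    # For every cluster, keep only the size of its most frequent label.
    majority_total = sum(
        counts.most_common(1)[0][1] for counts in label_counts.values()
    )
    return majority_total / len(cluster_ids)


# Placeholder data: six samples, three clusters, labels invented for the example.
cluster_ids = [0, 0, 0, 1, 1, 2]
expert_labels = ["family_a", "family_a", "family_b", "family_b", "family_b", "clean"]
purity = cluster_purity(cluster_ids, expert_labels)
print(f"purity = {purity:.3f}")
```

Purity alone can mislead, since it reaches 1.0 trivially when every sample sits in its own cluster, so in practice it would likely be read alongside other measures and expert review.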
Advances in the field of machine learning have kicked off a completely new era. An era where almost any piece of data collected is processed and analyzed via algorithms that depend on machine learning technology – cybersecurity included.
Juraj Janosik, Leader of AI/ML section & Filip Mazan, Senior Software Engineer Team Lead
ESET
Project team
Jakub Ševcech
Research Consultant 10/2020-02/2023