At KInIT, my colleagues and I all do the same thing: we keep learning so we can leverage our knowledge for a better world.
Santiago Jose de Leon Martinez
Join our PhD program and become an expert in artificial intelligence. Dive into challenging research topics with world-class mentors and industry partners. We welcome curious minds to join our open culture.
Take advantage of an inspiring environment and become a top expert in your selected AI topic. At the Kempelen Institute, we prioritise ethical responsibility and societal benefit.
When working on your dissertation, you’ll have access to a distinguished supervising team, including industrial partners and/or renowned experts from world-class research institutions.
Choose a research topic that deeply interests you and will enrich the field. At KInIT, we support excellent, original, world-class research and its connection with industry and with current research and innovation challenges.
Our culture is based on trust, openness and respect. That is how great ideas are born. We share knowledge and innovations not only within the institute, but also with our industry partners and other curious people like you.
You’ll be teamed up with the best researchers in the region and gain invaluable experience and contacts from abroad that will boost your career growth in industry or academia.
You will become a full-time KInIT employee. We will give you all the time and space you need to focus on your research, skills development and growth.
Supervising team: Michal Kompan (supervisor, KInIT), Peter Brusilovsky (University of Pittsburgh), Branislav Kveton (Google Research), Peter Dolog (Aalborg University)
Keywords: personalised recommendation, biases, machine learning, user model, fairness, off-policy
Recommender systems are an integral part of almost every modern Web application. Personalized, or at least adaptive, services have become a standard that users expect in almost every domain (e.g., news, chatbots, social media, or search).
Personalization has a great impact on the everyday life of hundreds of millions of users across many domains and applications. This results in a major challenge – to propose methods that are not only accurate but also trustworthy and fair. Such a goal offers plenty of research opportunities in many directions:
There are several application domains where these research problems can be addressed, e.g., search, e-commerce, social networks, news, and many others.
Relevant publications:
The research will be performed at the Kempelen Institute of Intelligent Technologies (KInIT, https://kinit.sk) in Bratislava in cooperation with industrial partners or researchers from highly respected research units from abroad. A combined (external) form of study and full employment at KInIT is expected.
Supervising team: Jakub Šimko (supervisor, KInIT), Peter Brusilovsky (University of Pittsburgh), Peter Dolog (Aalborg University)
Keywords: generative AI, large language models, machine learning, human in the loop, crowdsourcing, human computation, active learning.
The models created in machine learning can only be as good as the data on which they are trained. Researchers and practitioners thus strive to provide their training processes with the best data possible. It is not uncommon to spend much human effort on achieving good general data quality upfront (e.g. through annotation). Yet sometimes, upfront dataset preparation cannot be done properly, sufficiently or at all.
In such cases, human-in-the-loop solutions use human effort to improve machine-learned models through actions taken during the training process and/or during the deployment of the models (e.g. user feedback on automated translations). They are particularly useful for surgical improvements of training data through identifying and resolving border cases.
Human-in-the-loop approaches draw from a wide palette of techniques, including active and interactive learning, human computation, and crowdsourcing (also with motivation schemes of gamification and serious games). With the recent emergence of large language models (LLMs), the original human-in-the-loop techniques can be further boosted to create extensive synthetic training sets with comparatively small human effort.
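As an illustration of active learning, one of the techniques listed above, the following minimal sketch picks the items a model is least certain about and sends only those to human annotators. The item names, probabilities and the function name select_for_annotation are hypothetical and chosen only for this example; entropy-based uncertainty sampling is one common selection strategy among many.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions, budget):
    """Return the ids of the `budget` items the model is least sure about.

    `predictions` maps an item id to the model's class probabilities.
    """
    ranked = sorted(
        predictions,
        key=lambda item: entropy(predictions[item]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled posts (two classes).
predictions = {
    "post_a": [0.98, 0.02],
    "post_b": [0.55, 0.45],
    "post_c": [0.80, 0.20],
    "post_d": [0.50, 0.50],
}

to_label = select_for_annotation(predictions, budget=2)
print(to_label)
```

With a budget of two, the annotators would see only the two borderline posts, which is the kind of targeted, small human effort the paragraph above describes.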
The domains of application of human-in-the-loop are predominantly those with a lot of heterogeneity and volatility of data. Such domains include online false information detection, online information spreading (including spreading of narratives or memes), auditing of social media algorithms and their tendencies for disinformation spreading, support of manual/automated fact-checking and more.
Relevant publications:
The research will be performed at the Kempelen Institute of Intelligent Technologies (KInIT, https://kinit.sk) in Bratislava in cooperation with industrial partners or researchers from highly respected research units. A combined (external) form of study and full employment at KInIT is expected.
Supervising team: Viera Rozinajová (supervisor, KInIT), Anna Bou Ezzeddine (KInIT), Gabriela Grmanová (KInIT)
Keywords: machine learning models, scientific machine learning, knowledge representation, transfer learning, energy domain, environment
For decades, the behavior of systems in the physical world has been modeled by numerical models based on the vast scientific knowledge about the underlying natural laws. However, the increasing capabilities of machine learning algorithms are starting to disrupt this landscape. With large enough datasets, they can learn the recurring patterns in the data.
However, a pure machine learning model usually has poor interpretability, needs a lot of training data, which can be hard to come by in many scientific domains, and might not generalize properly. These concerns are being addressed by the field of scientific machine learning (SciML) – an emerging discipline within the data science community. It introduces scientific domain knowledge into the learning process. SciML aims to develop new methods for scalable, robust, interpretable and reliable learning.
Physics-informed neural networks are a part of SciML – we include the physical constraints in the model through appropriate loss functions or tailored interventions into the model architecture. Through physics-informed machine learning, we can create neural network models that are physically consistent, data efficient, and trustworthy.
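To make the idea of a physics-informed loss concrete, here is a toy sketch. It assumes a known law du/dt = -K·u, a two-parameter surrogate instead of a neural network, and finite differences instead of automatic differentiation; all names and values are illustrative assumptions. The point is only the structure of the loss: a data term plus a weighted penalty for violating the physical constraint.

```python
import math

K = 0.5  # assumed known decay rate in the law du/dt = -K * u


def model(t, params):
    """Toy surrogate u(t) = a * exp(b * t) with trainable a and b."""
    a, b = params
    return a * math.exp(b * t)


def data_loss(params, observations):
    """Mean squared error against measured (t, u) pairs."""
    return sum((model(t, params) - u) ** 2 for t, u in observations) / len(observations)


def physics_loss(params, collocation_points, h=1e-4):
    """Mean squared residual of du/dt + K*u = 0, using central differences."""
    total = 0.0
    for t in collocation_points:
        du_dt = (model(t + h, params) - model(t - h, params)) / (2 * h)
        total += (du_dt + K * model(t, params)) ** 2
    return total / len(collocation_points)


def total_loss(params, observations, collocation_points, weight=1.0):
    """Data fit plus weighted penalty for breaking the physical law."""
    return data_loss(params, observations) + weight * physics_loss(params, collocation_points)


# Two measurements only; the physics term is also checked at unmeasured times.
observations = [(0.0, 2.0), (1.0, 2.0 * math.exp(-0.5))]
collocation = [0.0, 0.5, 1.0, 1.5, 2.0]

consistent = total_loss((2.0, -0.5), observations, collocation)
inconsistent = total_loss((2.0, -0.1), observations, collocation)
print(consistent, inconsistent)
```

Parameters that obey the law give a loss close to zero, while parameters that violate it are penalised even at time points where no measurement exists. This is how the physics term can compensate for scarce data.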
The goal of the research is to explore how to incorporate scientific knowledge into the machine learning models, thus creating hybrid models based on SciML principles that include both data-driven and domain-aware components. The research could also be directed towards a combination of SciML and transfer learning (that reuses a pre-trained model on a new problem). The aim of such a combination is to take advantage of both approaches.
SciML can be applied in many domains. We focus mainly on power engineering (e.g. supporting the adoption of renewables) and on Earth science, with an emphasis on positive environmental impact and improved climate resilience, but any other domain could be selected.
Relevant publications:
The research will be performed at the Kempelen Institute of Intelligent Technologies (KInIT, https://kinit.sk) in Bratislava in cooperation with industrial partners or researchers from highly respected research units from abroad. A combined (external) form of study and full employment at KInIT is expected.
Supervising team: Mária Bieliková (supervisor, KInIT), Róbert Móro (KInIT)
Keywords: generative AI, large language models, in-context learning, instruction fine-tuning, transfer-learning
Large language models (LLMs) are increasingly being used for a wide range of downstream tasks where they often show good performance in zero/few-shot settings compared to specialized fine-tuned models, especially for tasks in which the LLMs can tap into the vast knowledge learned during pre-training. However, they lag behind the specialized fine-tuned models in tasks requiring more specific domain knowledge and adaptation. Additionally, they often suffer from problems such as hallucinations, i.e., outputting coherent, but factually false or nonsensical answers; or generating text laden with biases propagated from pre-training data. Various approaches have recently been proposed to address these issues, such as improved prompting strategies including in-context learning, retrieval-augmented generation or adapting the LLMs through efficient fine-tuning.
Each of these approaches (or a combination thereof) presents opportunities for new discoveries. Orthogonal to this, models have other important properties, such as their level of alignment with human values, robustness, explainability and interpretability. Advances in these respects are welcome as well, both in AI generally and in the approaches mentioned above.
There are many downstream tasks where research on LLM adaptation methods can be applied. These include (but are not limited to) false information (disinformation) detection, credibility signals detection, auditing of social media algorithms and their tendencies for disinformation spreading, and support of manual/automated fact-checking.
Relevant publications:
The research will be performed at the Kempelen Institute of Intelligent Technologies (KInIT, https://kinit.sk) in Bratislava in collaboration with industrial partners or researchers from highly respected research units involved in international projects. A combined (external) form of study and full employment at KInIT is expected.
Supervising team: Jakub Šimko (supervisor, KInIT), Dominik Macko (KInIT)
Keywords: generative AI, large language models, dataset creation, dataset augmentation, machine generated text detection, metrics and evaluation, machine learning
The advent of large language models (LLMs) is raising research questions about how to measure the quality and properties of their outputs. Such measures are needed for benchmarking, model improvements or prompt engineering. Some evaluation techniques pertain to specific domains and scenarios of use (e.g., how accurate are the answers to factual questions in a given domain? how well can we use the generated answers to train a model for a specific task?), others are more general (e.g., what is the diversity of paraphrases generated by an LLM? how easy is it to detect that the content is generated?).
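As one example of a general measure, the diversity of a set of generated paraphrases can be approximated with distinct-n, a widely used lexical diversity ratio (unique n-grams divided by all n-grams). The sketch below is a simplified illustration with whitespace tokenisation and made-up sentences; it is not presented as the metric used at the institute.

```python
def distinct_n(texts, n=2):
    """Ratio of unique n-grams to all n-grams across the given texts."""
    ngrams = []
    for text in texts:
        tokens = text.lower().split()  # naive whitespace tokenisation
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


# Hypothetical paraphrase sets: one repetitive, one varied.
repetitive = ["the cat sat down", "the cat sat down"]
varied = ["the cat sat down", "a feline took a seat"]

print(distinct_n(repetitive))
print(distinct_n(varied))
```

A higher value means less repeated wording. Purely lexical measures like this ignore meaning, which is one reason the design of better metrics remains an open research question.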
Through replication studies, benchmarking experiments, metric design, prompt engineering and other approaches, the candidate will advance the methods and experimental methodologies of LLM output quality measurement. Of particular interest are two general scenarios:
The candidate will select one of the two general scenarios (without being limited to them), identify and refine specific research questions, and answer them experimentally.
Relevant publications:
The research will be performed at the Kempelen Institute of Intelligent Technologies (KInIT, https://kinit.sk) in Bratislava in cooperation with industrial partners or researchers from highly respected research units. A combined (external) form of study and full employment at KInIT is expected.
Supervising team: Marián Šimko (supervisor, KInIT), with possible collaboration with Jana Kosecka (George Mason University) or Martin Hurban (ČSOB)
Keywords: large language models, natural language processing, trustworthy NLP, multilingual learning, information extraction
The recent development of large language models (LLMs) shows the potential of deep learning and artificial neural networks for many natural language processing (NLP) tasks. Advances in their automation have a significant impact on a plethora of innovative applications affecting everyday life.
Although large-scale language models have been successfully used to solve a large number of tasks, several research challenges remain. These may be related to individual natural language processing tasks, application domains, or the languages themselves. In addition, new challenges stem from the nature of large language models and the so-called black-box nature of neural network-based models.
Further research and exploration of related phenomena is needed, with special attention to the problem of trustworthiness in NLP or new learning paradigms addressing the problem of low availability of resources needed for learning (low-resource NLP). Interesting research challenges that can be addressed within the topic include:
Relevant publications:
The research will be performed at the Kempelen Institute of Intelligent Technologies (KInIT, https://kinit.sk) in Bratislava in cooperation with industrial partners or researchers from highly respected research units from abroad, depending on the selected subtopic. A combined (external) form of study and full employment at KInIT is expected.
Supervising team: Michal Gregor (supervisor, KInIT), Marián Šimko (KInIT), Jana Kosecka (George Mason University)
Keywords: large language models, deep learning, machine learning, multi-modal, in-context learning, long context, fine-tuning
Large language models (LLMs) are powerful tools that can support a wide range of downstream tasks. They can be used e.g. in advanced conversational interfaces or in various tasks that involve retrieval, classification, generation, and more. Such tasks can be approached through zero-shot or few-shot in-context learning, or by fine-tuning the LLM on larger datasets (typically using parameter-efficient techniques to reduce memory and storage requirements).
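The difference between zero-shot and few-shot in-context learning can be sketched as a matter of how the prompt is assembled. The helper below is a hypothetical illustration: the function name, the "Text:" / "Label:" layout and the example claims are assumptions, and no particular LLM API is implied. The resulting string would simply be passed to whichever model is used.

```python
def build_prompt(instruction, examples, query):
    """Assemble a classification prompt.

    With an empty `examples` list this is a zero-shot prompt; with a few
    labelled (text, label) pairs it becomes a few-shot prompt.
    """
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"Text: {text}", f"Label: {label}", ""]
    lines += [f"Text: {query}", "Label:"]
    return "\n".join(lines)


instruction = "Decide whether the claim is checkable or an opinion."
examples = [
    ("The bridge was opened in 1972.", "checkable"),
    ("This is the prettiest bridge in town.", "opinion"),
]
query = "The city has 400,000 inhabitants."

zero_shot = build_prompt(instruction, [], query)
few_shot = build_prompt(instruction, examples, query)
print(few_shot)
```

Fine-tuning, by contrast, changes the model's parameters rather than the prompt, which is why it needs larger datasets and more memory.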
Despite their unprecedented performance in many tasks, LLMs suffer from several significant limitations that currently hinder their safe and widespread use in many domains. These limitations include tendencies to generate responses not supported by the training corpus or input context (hallucination), difficulties in handling extremely long contexts (e.g., entire books), and limited ability to utilize other data modalities such as vision, where state-of-the-art models generally struggle to recognize fine-grained concepts.
The goal of this research is to explore such limitations, and – after selecting one or two of them to focus on – to propose new strategies to mitigate them. These strategies may include e.g.:
Relevant publications:
The application domain can be, for example, support for fact-checking and combating disinformation, where the factuality of LLM outputs is absolutely critical.
The research will be performed at the Kempelen Institute of Intelligent Technologies (KInIT, https://kinit.sk) in Bratislava in cooperation with researchers from highly respected research units. A combined (external) form of study and full employment at KInIT is expected.
Supervising team: Mária Bieliková (KInIT guarantor), Peter Richtárik (KAUST), Martin Takáč (Mohamed Bin Zayed University), Peter Tino (University of Birmingham), Peter Dolog (Aalborg University)
Keywords: machine learning, deep learning, learning theory, optimization, trustworthiness
Machine learning is at the centre of artificial intelligence research. Many researchers worldwide, both in academia and industry, work on topics related to machine learning. This very dynamic field is characterized by a fast transfer of solutions into practical use.
The topics in this domain are defined by premier scientific conferences, where top-class researchers meet, for example ICML (International Conference on Machine Learning), NeurIPS (Conference on Neural Information Processing Systems), IJCAI (International Joint Conference on AI), COLT (Conference on Learning Theory).
This thesis will be advised by an external mentor, who will also define its particular topic.
Interesting research challenges are contained within (but are not limited to) these topics:
There are many application domains where advanced machine learning methods can be deployed.
The research will be performed at the Kempelen Institute of Intelligent Technologies (KInIT, https://kinit.sk) in Bratislava in cooperation with researchers from highly respected research units. A combined (external) form of study and full employment at KInIT is expected.
We provide PhD study in collaboration with the Faculty of Information Technology at VUT, Brno (Czechia). You’ll be enrolled at the faculty as an external doctoral student. After you’re accepted for study, you’ll work as a full-time employee at our institute. Your doctoral research will be your primary job role.
You can choose from multiple interesting topics for your dissertation. They are based on different mentoring schemes:
You’ll have access to an expert supervising team. Depending on the selected scheme, it can include industrial partners and/or a scientist from a renowned foreign university.
While working at our institute you’ll be a member of a KInIT research team. The teams collaborate on various projects or on other activities:
The rest is up to you! Turn your curiosity and passion for intelligent technologies into new discoveries or inventions making the world a better place.
Mária Bieliková is an expert researcher at KInIT. She focuses on human-computer interaction analysis, user modeling and personalization. Recently, she has been working in data analysis and modeling of antisocial behavior on the Web. She is active in discussions on trustworthy AI at the national and European levels. Mária has supervised 19 successful doctoral graduates to date. She co-authored 70+ journal publications, 200+ conference papers, received 4,400+ citations (Google Scholar h-index 30), and serves on the editorial board of two CC journals. She has been the principal investigator in 40+ research projects.
Michal Gregor is an expert researcher at KInIT. He focuses especially on artificial neural networks and deep learning, on reinforcement learning, and more recently on multi-modal learning and learning that involves language supervision. Michal also has experience in other areas of AI such as metaheuristic optimization methods, representation of uncertain knowledge, probabilistic models and more.
Michal Kompan is an expert researcher at KInIT. He focuses on recommender systems, machine learning, user modeling, and information retrieval. His research is focused on predictive modeling and customer behavior (e.g., churn prediction, next-item recommendation), as well as content-based adaptive models. Michal serves as a reviewer and/or program committee member at several international conferences, such as RecSys, SIGIR, WWW, ADBIS, Hypertext, UMAP and SMAP.
Viera Rozinajová is an expert researcher at KInIT. She focuses on intelligent data analysis, particularly predictive modeling, cluster analysis, anomaly detection and optimization. Before her employment at KInIT, she worked as an associate professor at the Faculty of Informatics and Information Technologies at the Slovak University of Technology in Bratislava, where she headed up the Big Data Analysis group. She has authored or co-authored more than 70 publications in scientific journals and conferences. She has participated in more than 25 national and international research projects and has led several of them.
Jakub Šimko is an expert researcher at KInIT, where he also leads the Web and User Data Processing team. Jakub focuses on the intersection of human computation, machine learning and user modeling. He has recently been working on social media algorithm auditing and misinformation modeling, and he promotes interdisciplinary approaches to computer science research. He graduated from the Slovak University of Technology in Bratislava, where, after receiving his PhD, he worked for 7 years as a researcher and teacher. He co-authored more than 30 internationally recognized publications, together receiving more than 350 citations.
Marián Šimko is an expert researcher at KInIT. Marián focuses on natural language processing, information extraction, low-resource language processing and trustworthiness of neural models. He is a former vice-dean for Master’s study and alumni co-operation at the Slovak University of Technology.
Peter Brusilovsky is a Professor at the School of Computing and Information, University of Pittsburgh, where he directs the Personalized Adaptive Web Systems (PAWS) lab. His research is focused on user-centered intelligent systems in the areas of adaptive learning, recommender systems, and personalized health. He is a recipient of the Alexander von Humboldt Fellowship, the NSF CAREER Award, and the Fulbright-Nokia Distinguished Chair. Peter served as the Editor-in-Chief of IEEE Trans. on Learning Technologies, and as a program chair for several conferences including RecSys.
Peter Dolog is an Associate Professor at the Department of Computer Science, Aalborg University, Denmark. His current research interests include machine learning and data mining in the areas of user behavior analysis and prediction, recommender systems, preference learning, and personalization. Peter is a senior member of ACM and has served as a senior program committee member of AI-related conferences as well as a general chair of the UMAP, HT and Web Engineering conferences.
Jana Kosecka is a Professor at George Mason University. She is interested in computational models of vision systems, acquisition of static and dynamic models of environments by means of visual sensing, high-level semantic scene understanding and human-computer interaction. She held visiting positions at UC Berkeley, Stanford University, Google and Nokia Research, and served as Program chair, Area chair or senior member of the editorial board for leading conferences in the field, such as CVPR, ICCV and ICRA.
Jana is currently a mentor of our PhD student Ivana Beňová.
Branislav Kveton is a Principal Scientist at Amazon’s lab in Berkeley. He proposes, analyzes, and applies algorithms that learn incrementally, run in real time, and converge to near-optimal solutions as they learn. He made several fundamental contributions to the field of multi-armed bandits. His earlier work focused on structured bandit problems with graphs, submodularity, low-rank matrices, and ranked lists. His recent work focuses on making bandit algorithms practical.
Peter Richtárik is a Professor of Computer Science & Mathematics at KAUST. He is one of the founders and a Fellow of the Alan Turing Institute. Through his work on randomized and distributed optimization algorithms, he has contributed to the foundations of machine learning and federated learning. He serves as an Area Chair of leading machine learning conferences, including NeurIPS, ICML and ICLR.
Martin Takáč is an Associate Professor and Deputy Department Chair of the Machine Learning Department at MBZUAI, where he is a core faculty member at the Optimization & Machine Learning Lab. His current research interests include the design and analysis of algorithms for machine learning, applications of ML, optimization, and HPC. He serves as an Area Chair of ML conferences such as AISTATS, ICML, and NeurIPS.
Peter Tino is a Professor at the School of Computer Science, University of Birmingham, UK. He is interested in the interplay between mathematical modelling and machine learning (dynamical systems, probabilistic modelling, statistical pattern recognition, natural computation). Peter is interested in both foundational aspects and applications in interdisciplinary contexts (e.g. astrophysics, biomedical sciences, cognitive neuroscience).
He is a Fellow of the Alan Turing Institute and has served on editorial boards of leading journals such as IEEE TNNLS, IEEE TCYB, Neural Networks and Scientific Reports.
Transform your curiosity into excellence. If you are interested in doing a PhD degree at KInIT, let us know as soon as possible. Fill in the PhD @ KInIT expression of interest form no later than March 31st.