PhD Themes 2024: Improving Performance of Large Language Models for Downstream Tasks
Large language models (LLMs) are increasingly used for a wide range of downstream tasks, where they often perform well in zero-shot and few-shot settings compared to specialized fine-tuned models, especially on tasks where they can draw on the broad knowledge acquired during pre-training. However, they lag behind specialized fine-tuned models on tasks that require more specific domain knowledge and adaptation. They also often suffer from problems such as hallucinations, i.e., coherent but factually false or nonsensical outputs, and from generating text laden with biases propagated from the pre-training data. Various approaches have recently been proposed to address these issues, such as improved prompting strategies (including in-context learning), retrieval-augmented generation, and adaptation of LLMs through efficient fine-tuning.
Each of these approaches (or a combination thereof) presents opportunities for new discoveries. Orthogonal to these approaches are important properties of the models themselves, such as their alignment with human values, robustness, explainability, and interpretability. Advances in these properties are welcome as well, both in AI generally and in the approaches mentioned above in particular.
Research on LLM adaptation methods can be applied to many downstream tasks. These include (but are not limited to) the detection of false information (disinformation), the detection of credibility signals, the auditing of social media algorithms and their tendency to spread disinformation, and support for manual and automated fact-checking.
- Macko, D., Moro, R., Uchendu, A., Lucas, J.S., Yamashita, M., Pikuliak, M., Srba, I., Le, T., Lee, D., Simko, J. and Bielikova, M., 2023. MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. https://arxiv.org/abs/2310.13606
- Vykopal, I., Pikuliak, M., Srba, I., Moro, R., Macko, D., and Bielikova, M., 2023. Disinformation Capabilities of Large Language Models. Preprint at arXiv: https://arxiv.org/abs/2311.08838
The research will be performed at the Kempelen Institute of Intelligent Technologies (KInIT, https://kinit.sk) in Bratislava in collaboration with industrial partners or researchers from highly respected research units involved in international projects. A combined (external) form of study and full employment at KInIT is expected.
Mária Bieliková is an expert researcher at KInIT. She focuses on human-computer interaction analysis, user modeling and personalization. Recently, she has been working on data analysis and modeling of antisocial behavior on the Web. She is active in discussions on trustworthy AI at the national and European levels. Mária has supervised 19 successful doctoral graduates to date. She has co-authored 70+ journal publications and 200+ conference papers, has received 4,400+ citations (Google Scholar h-index 30), and serves on the editorial boards of two CC journals. She has been the principal investigator in 40+ research projects.
Róbert Móro is a senior researcher at KInIT. He focuses on user modeling, personalization and machine learning. His current primary research interest is in countering online disinformation and modeling users and human-computer interaction on the Web. He has participated in several international research projects (including Horizon Europe) and served as a program committee member at several conferences (e.g., ACM UMAP, ACM ETRA, IJCAI).