Research areas: natural language processing, computational linguistics, data processing, machine learning, computer-assisted learning, computer-aided education
Position: Research Engineer
Miroslav has a background in natural language processing tasks, combining linguistic and machine learning approaches. He has built several tools, lexicons and knowledge sources used in Slovak language processing.
He has experience with several NLP tasks: automatic question generation, tokenization, lemmatization, stemming, part-of-speech tagging, named entity recognition, diacritic restoration and text reconstruction, term extraction, text simplification, text similarity, sentiment analysis, coreference resolution, temporal data extraction and legal document extraction.
As a teacher at the Slovak University of Technology, he supervised more than 20 Bachelor’s and Master’s theses in NLP. He also has experience in the processing and representation of structured and textual data, as well as in software engineering (software development and software architecture).
Aspecta: Improving Public Procurement using Natural Language Processing
Other notable projects
DisAI: Improving scientific excellence of KInIT in AI and language technologies to fight disinformation
AI4Europe: The unified platform for boosting European AI academic and industrial research
Automatic question generation based on sentence structure analysis using machine learning approach
Selected Student Supervision
- Lukáš Radoský – Similarity of short texts in Slovak language. Defended 2021.
- Simona Zelenčíková – Coreference resolution in text. Defended 2021.
- Martin Grega – Temporal sequence identification in the text. Defended 2021.
- Richard Galeštok – Automated analysis of sentiment in Slovak texts. Defended 2020.
- Lukáš Belaj – Named entity extraction from Slovak text. Defended 2019.
- Lukáš Miškovský – Identification of coreference links in text. Defended 2017.
- Michal Hunák – Detecting tricky plagiarism. Defended 2020.
- Juraj Gemeľa – Building Dictionaries by Games. Defended 2019.
- Ondrej Harnúšek – Tool for determining similarity of texts. Defended 2019.
- Lukáš Radoský – Lexicon construction using games. Defended 2019.
- Tomáš Gábrš – Question generation from educational text. Defended 2017.
- Martin Nemček – Educational texts processing. Defended 2016.