Slovak Natural Language Processing Community

Our goal is to foster collaboration in education, research, progress and innovation focused on natural language processing and language technologies for the Slovak language.

Our ambition is to advance the state of the art in automated Slovak language processing and related fields, gradually involving all parts of the Slovak innovation ecosystem. The Slovak NLP community connects people working on language processing in research, industry, and education, supports knowledge sharing, and creates opportunities for new collaborations.

Operation and working groups

The current operating model is based on coordination meetings, typically held twice a year, with work continuing between meetings in working groups that address specific open problems in NLP or related areas. The community is coordinated by the Kempelen Institute of Intelligent Technologies.

Working groups include:

  • Steering Committee
  • Benchmarking models for the Slovak language
  • Infrastructure and resources for NLP

Partners and teams

Currently, academic teams are actively involved. A long-term ambition is to expand participation toward industry, public institutions, and other partners.

  • Kempelen Institute of Intelligent Technologies
    • Natural Language Processing (NLP) Research Team
    • Web and User Data Processing (WUDAP) Research Team
  • Technical University of Košice
    • Laboratory of Speech Communication Technologies, KEMT FEI
    • Laboratory of Intelligent Multimodal Data Analysis, KKUI FEI
  • Slovak Academy of Sciences
    • Ľ. Štúr Institute of Linguistics
    • Institute of Informatics, Department of Speech Analysis and Synthesis
    • Computing Centre
  • Comenius University in Bratislava – NaiveNeuron Research Group (FMFI)
  • Constantine the Philosopher University in Nitra – NLP Lab, Department of Computer Science
  • Pavol Jozef Šafárik University in Košice – Institute of Informatics, Faculty of Science

Meetings

#3 TUKE, Košice, 20.11.2025

#4 TBA

Selected Results and Outputs

Benchmarks, Models, etc.

  • skLEP – a GLUE-style benchmark (repo, paper)
  • SkMTEB – text embedding benchmark (repo, paper)
  • mistral-sk-7b – repo
  • Qwen3-14B-sk – repo
  • List of NLP resources for Slovak language processing – repo

Other

  • Statement on the importance of language technology research for the competitiveness of Slovakia (September 2024)
  • Memorandum of Understanding on cooperation in the development of natural language processing and language models for the Slovak language (September 2025)
  • Establishment of the CLARIN-SK association (February 2026)

Slovak NLP Community Online

Related Activities

  • NLP (Summer) Schools – a regular NLP summer school for those interested in learning about language technologies
  • Better_AI meetups – regular gatherings for AI enthusiasts; selected sessions focused on NLP: Vol_15, Vol_13, Vol_08, Vol_07, Vol_1