KInIT at the EMNLP conference in Singapore 

The EMNLP (Empirical Methods in Natural Language Processing) conference is one of the top NLP conferences. Our researchers Róbert Móro and Ján Čegiň presented 3 full papers at this prestigious event.

Three KInIT papers accepted for the main track

We submitted three papers to this renowned NLP conference and all of them were accepted, which is a significant milestone in our publishing journey.

1. MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark  

This paper is a collaborative effort between Penn State University, MIT Lincoln Laboratory, the University of Mississippi and KInIT, and a solid example of international cooperation. The paper is the result of our work on projects including VIGILANT. In the paper, we address the problem of detecting machine-generated text in 11 languages by building and publishing a benchmarking dataset for this task and comparing existing state-of-the-art methods.

Róbert Móro presented the paper; you can check it out here.

2. Multilingual Previously Fact-Checked Claim Retrieval

In this paper, we focused on the problem of retrieving previously fact-checked claims in different languages. We presented a unique dataset for this task, covering over 30 languages, and performed an extensive comparison of different models for representing claims. Róbert Móro presented this paper, which is the result of our work on the DisAI and CEDMO projects.

The paper is available here.

3. ChatGPT to Replace Crowdsourcing of Paraphrases for Intent Classification: Higher Diversity and Comparable Model Robustness

The last paper was presented by our PhD student Ján Čegiň. One of the co-authors is Peter Brusilovsky from the University of Pittsburgh. The paper is the result of our work on projects including CEDMO. It explores the use of large language models (LLMs) such as ChatGPT to generate training data through paraphrasing for the “intent classification” task.

The paper is available here.

Meeting with partners

The conference gave us a chance to talk to and get to know many researchers in the NLP community, and also to get together with some of our co-authors and project partners. It was wonderful to meet our co-authors of the MULTITuDE paper (Adaku Uchendu, Jason Samuel Lucas and Michiharu Yamashita from Penn State University and MIT Lincoln Laboratory).

An interesting part of the conference program was the presentation by Freddy Heppell from the University of Sheffield. KInIT is collaborating with the University of Sheffield on projects including VIGILANT and ExU.

It was also great to see Simon Ostermann from DFKI, with whom we are collaborating on the DisAI project.

About the conference

More than 2,000 participants attended the conference, which ran from 8 to 10 December 2023. Our PhD student Ján Čegiň was also one of the student volunteers who helped organize the conference on site. Next year’s conference will take place in Miami.