The term bias can be understood in many ways, both positive and negative. There are many types of biases, such as historical, representation, or societal bias. Societal biases can be understood as stereotypes based on demographic factors or physical characteristics, relating to aspects such as race, ethnicity, gender, sexual orientation, socioeconomic status, or education.

Such stereotypes are often deemed unfair. Unfortunately, the answer to the question of whether AI systems can ever be completely fair is probably no. This is particularly true for algorithms that learn from data, since the data itself may already carry certain biases.

However, that does not mean we should accept the situation as it is. We need to better understand the mechanisms behind AI bias and aim to minimize the potential risks before they cause unwanted harm to people.

Useful knowledge is not the only thing artificial intelligence adopts from us. Sometimes it also acquires our biases and prejudices. Unfortunately, we know almost nothing about biases in AI models working with the Slovak language. But we are about to change that.

Online Workshop: Gender Biases in Artificial Intelligence

Collaboration between people from different fields of science and research is often mutually enriching and necessary, especially when it involves investigating and solving more complex problems. This is also the case for research on biases in artificial intelligence. Understanding and addressing them requires combining several disciplines, research approaches, and perspectives offered by, for example, computer science (especially AI), ethics, and even gender studies, which examine the meaning of gender in culture, society, and science.

In pursuit of an interdisciplinary approach, an online workshop entitled “Gender Bias in Artificial Intelligence” was held in March and was attended by gender experts alongside KInIT researchers. The main goal of the meeting was to obtain lessons and guidance on treating the topic of gender more rigorously in the project “Societal Biases in Slovak AI”, and to find out whether a representative list of gender stereotypes in Slovakia, or a corpus of linguistic expressions containing gender prejudices, already exists that could be explored further in experimental work. The knowledge gained and the information resources provided should contribute to project outputs that lead to a better understanding of gender bias in AI in Slovakia.

The workshop included a presentation of the project and related research activities, including planned experiments, as well as an introduction to AI-based language models. The subsequent discussion on methodological aspects of the research and the nature of gender bias provided an opportunity to compare and enrich our perspectives on the subject.

As a result of the discussion, we found, for example, that although there is no representative corpus of linguistic expressions containing gender prejudices in the Slovak language, it is possible to compile a list of gender stereotypes from various sources (e.g., the EIGE report A study of collected narratives on gender perceptions in the 27 EU Member States). This list will serve as a basis for creating datasets of linguistic expressions for testing AI-based language models, along the lines of the sketch below. The essential elements of such a list were already defined during the workshop, creating a map of the main areas and themes in which gender stereotypes and prejudices appear and which any representative list of stereotypes should include.
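To make the idea concrete, here is a minimal sketch in Python of how such a stereotype list could be expanded into paired probe sentences for a language model. All templates and slot fillers below are hypothetical illustrations, not items from the actual dataset.

```python
# A minimal sketch, not the project's actual code: turning stereotype
# templates into minimal pairs that differ only in gender.
# All template strings and fillers here are hypothetical examples.

from itertools import product

# Each template has a {person} slot; pairing a male and a female filler
# yields a minimal pair that isolates the effect of gender.
TEMPLATES = [
    "{person} takes care of the household.",
    "{person} is good at mathematics.",
]

GENDER_PAIRS = [("He", "She"), ("My brother", "My sister")]

def generate_probe_pairs(templates, gender_pairs):
    """Yield (male_sentence, female_sentence) minimal pairs."""
    for template, (male, female) in product(templates, gender_pairs):
        yield template.format(person=male), template.format(person=female)

if __name__ == "__main__":
    for male_s, female_s in generate_probe_pairs(TEMPLATES, GENDER_PAIRS):
        print(male_s, "|", female_s)
```

A language model that systematically treats the two sentences in each pair differently, for instance by assigning them very different probabilities, exhibits the kind of gender bias the datasets are meant to surface.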

We are grateful to gender experts Adriana Jesenková, Mariana Szapuová, Jana Jablonická Zezulová, Jana Valdrová, and Veronika Valkovičová for their participation in the workshop.

Data-Ethics Assessment

Within the framework of the project “Societal Biases in Slovak AI,” a data-ethics assessment was carried out between February and April 2023 under the guidance of researchers from the Ethics and Human Values in Technology team at KInIT. The assessment focused mainly on ethical risks related to data collection, use, and publication in the context of the project’s research tasks. The purpose of such an assessment is primarily to avoid unethical behavior, that is, failing to follow certain ethical principles or violating values when working with data. Unethical behavior brings various negative consequences and ultimately reduces the quality of scientific research. In addition, the ethical review process also shapes, to some extent, how researchers think about ethics.

The data-ethics assessment took place mainly in the form of several workshops attended by the researchers participating in the project tasks, especially those who work directly with data. The assessment process consisted of several phases; the key ones included the identification of stakeholders, the identification of ethical issues, and risk management. Risk management involved identifying risks, estimating the likelihood of their occurrence and the severity of their impact, and proposing appropriate countermeasures that could prevent the risks or at least minimize their negative impact on stakeholders.

Clarifying the research motives, aims, and benefits was also an important part of the ethical assessment. One of its essential impulses was the call to involve people with gender expertise as important stakeholders. They participated in the assessment process and in fulfilling some of the project’s objectives. This call prompted, for example, the expert workshop described earlier on this page.

Final Event: Societal Biases in Slovak AI

We presented the project outcomes on October 18, 2023. You can watch the event recording here.

The event was held in Slovak, but the presentation by Miro Dudík, Putting AI Fairness Research into Practice, was in English; it kicks off at 01:13:00 in the recording.

Results

  • Report on the Current State of Societal Biases in Slovak AI – The report documents our work and presents the main findings. It is intended to inform both the general public and AI experts, and one of its main goals is to increase awareness of the issue. It also provides a general overview of the project and can be used by researchers to analyze our work and results.
  • Data and Code – The data we have collected for various types of AI systems can be used by AI developers to measure and mitigate bias in their systems, as well as by other researchers; a minimal sketch of such a measurement follows this list. All the data are published and freely available. Although we focus on the Slovak language, the data can be reused to study this issue in other languages as well.
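As an illustration of how probe sentences like ours can be used, the following sketch compares the probabilities a masked language model assigns to a male versus a female word in the same context. The model name (gerulata/slovakbert, a publicly available Slovak model) and the probe sentence are assumptions chosen for the example, not the project’s exact evaluation setup.

```python
# A minimal sketch of one way to measure gender bias with probe data:
# compare the scores a masked language model assigns to a male vs. a
# female word in the same masked position. Hypothetical example setup.

from transformers import pipeline

# Any RoBERTa-style model with a <mask> token would work the same way.
fill_mask = pipeline("fill-mask", model="gerulata/slovakbert")

sentence = "<mask> sa stará o domácnosť."  # "<mask> takes care of the household."
targets = ["On", "Ona"]                    # "He" / "She"

# Restricting the pipeline to the two target words returns their scores
# for the masked position (the pipeline warns if a target is not a
# single token in the model's vocabulary).
results = fill_mask(sentence, targets=targets)
scores = {r["token_str"].strip(): r["score"] for r in results}

# A positive gap means the model prefers the male word in this context.
gap = scores.get("On", 0.0) - scores.get("Ona", 0.0)
print(scores, "gap:", gap)
```

Aggregating such gaps over many sentence pairs gives a rough picture of which stereotype areas a model is most biased in.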
This project is funded by the U.S. Embassy in Bratislava.