UCSC Researchers’ Tool Finds Bias in State-of-the-Art Generative AI Model

August 16, 2023 Malina Long

Examples of images generated by text prompts input to the Stable Diffusion model with and without gender-specific language in the prompt. For example, the upper left group of four images was produced from the prompt "child studying science."

Article by Emily Cerf via UC Santa Cruz Newscenter

Assistant Professor of Computer Science and Engineering, Xin (Eric) Wang.

Text-to-image (T2I) generative artificial intelligence tools are increasingly powerful and widespread, and can create nearly any image from just a few words of input. T2I generative AI can create convincingly realistic photos and videos, which are increasingly used for purposes ranging from art to political campaigning.

However, the algorithmic models that power these tools are trained on data from humans, and can replicate human biases in the images they produce, such as biases around gender and skin tone. These biases can harm marginalized populations, reinforcing stereotypes and potentially leading to discrimination.

To address these implicit biases, Assistant Professor of Computer Science and Engineering Xin (Eric) Wang and a team of researchers from Baskin Engineering at UC Santa Cruz created a tool called the Text to Image Association Test, which provides a quantitative measurement of complex human biases embedded in T2I models, evaluating biases across dimensions such as gender, race, career, and religion. They used this tool to identify and quantify bias in the state-of-the-art generative model Stable Diffusion.

The tool is detailed in a paper for the 2023 Association for Computational Linguistics (ACL) conference, a premier computer science conference, and is available for use in a demo version.

“I think both the model owners and users care about this issue,” said Jialu Wang, a UCSC computer science and engineering Ph.D. student and the first author on the paper. “If the user is from an unprivileged group, they may not want to see just the privileged group reflected in the images they generate.”

To use the tool, a user must tell the model to produce an image for a neutral prompt, for example “child studying science.” Next, the user inputs gender-specific prompts, such as “girl studying science” and “boy studying science.” Then, the tool calculates the distance between the images generated with the neutral prompt and the images generated with each of the specific prompts. The difference between those two distances is a quantitative measurement of bias.
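The distance comparison described above can be illustrated with a short Python sketch. It is not the researchers' implementation: it assumes each generated image has already been converted into a numeric embedding vector by some image encoder, and it uses cosine distance as the measure, neither of which the article specifies. The function names and the toy two-dimensional vectors are hypothetical.

```python
import math

def cosine_distance(u, v):
    """Cosine distance between two embedding vectors (0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def mean_distance(neutral, specific):
    """Average distance over every (neutral image, specific image) pair."""
    pairs = [(n, s) for n in neutral for s in specific]
    return sum(cosine_distance(n, s) for n, s in pairs) / len(pairs)

def bias_score(neutral, group_a, group_b):
    """Difference between the two distances.

    A positive value means the neutral-prompt images sit farther from
    group_a than from group_b, i.e. they lean toward group_b.
    """
    return mean_distance(neutral, group_a) - mean_distance(neutral, group_b)

# Toy embeddings (assumed, for illustration only).
neutral_images = [[1.0, 0.0], [0.9, 0.1]]   # "child studying science"
girl_images = [[0.0, 1.0]]                  # "girl studying science"
boy_images = [[1.0, 0.0]]                   # "boy studying science"

score = bias_score(neutral_images, girl_images, boy_images)
print(f"bias score: {score:.3f}")
```

In this toy case the score is positive, meaning the neutral-prompt images are closer to the "boy" images than to the "girl" images. A score near zero would indicate no measurable lean in either direction.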

Using their tool, the research team found that the state-of-the-art generative model Stable Diffusion both replicates and amplifies human biases in the images it produces. The tool tests the association between two concepts, such as science and arts, and two attributes, such as male and female. It then gives an association score between the concept and the attribute and a value to indicate how confident the tool is in that score.
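One common way to produce a score of this kind, together with a confidence value, is an effect size paired with a permutation test, as in earlier association tests for word embeddings. The Python sketch below follows that pattern. The article does not give the tool's formulas, so the specific calculation, the function names, and the toy embeddings are all assumptions made for illustration.

```python
import math
import random
import statistics

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def association(w, attr_a, attr_b):
    """How much closer one embedding is to attribute set A than to set B."""
    to_a = statistics.fmean([cosine(w, a) for a in attr_a])
    to_b = statistics.fmean([cosine(w, b) for b in attr_b])
    return to_a - to_b

def effect_size(concept_x, concept_y, attr_a, attr_b):
    """Standardized gap between the two concepts' associations (assumed score)."""
    sx = [association(x, attr_a, attr_b) for x in concept_x]
    sy = [association(y, attr_a, attr_b) for y in concept_y]
    return (statistics.fmean(sx) - statistics.fmean(sy)) / statistics.stdev(sx + sy)

def permutation_p_value(concept_x, concept_y, attr_a, attr_b, rounds=1000, seed=0):
    """Share of random relabelings whose gap is at least as large as observed.

    A small value suggests the observed association is unlikely to be chance.
    """
    rng = random.Random(seed)
    sx = [association(x, attr_a, attr_b) for x in concept_x]
    sy = [association(y, attr_a, attr_b) for y in concept_y]
    observed = sum(sx) - sum(sy)
    pooled = sx + sy
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)  # randomly reassign images to the two concepts
        stat = sum(pooled[:len(sx)]) - sum(pooled[len(sx):])
        if stat >= observed - 1e-12:
            hits += 1
    return hits / rounds

# Toy embeddings (assumed, for illustration only).
science = [[1.0, 0.1], [0.9, 0.2]]
arts = [[0.1, 1.0], [0.2, 0.9]]
male = [[1.0, 0.0]]
female = [[0.0, 1.0]]

print("effect size:", effect_size(science, arts, male, female))
print("p-value:", permutation_p_value(science, arts, male, female))
```

With only four toy images the p-value cannot get very small, since few relabelings are possible. Real use would involve many generated images per prompt.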

The team used their tool to test whether the model associates six sets of opposing concepts with positive or negative attributes. The concepts they tested were: flowers and insects, musical instruments and weapons, European American and African American, light skin and dark skin, straight and gay, and Judaism and Christianity. For the most part, the model made associations along stereotypical patterns. However, the model associated dark skin with pleasant attributes and light skin with unpleasant ones, which surprised the researchers as one of the few results that ran counter to common stereotypes.

Additionally, they found that the model associated science more closely with males and art more closely with females, and associated careers more closely with males and family more closely with females.