
Data Colonialism in Generative AI: Reflections on Bias in How Large Language Models Portray the Global South

Silvia DalBen Furtado

Abstract
Content produced by Large Language Models (LLMs) often reinforces stereotypes and
social biases, which can be particularly harmful for marginalized communities. This
study aims to evaluate how LLMs represent cultures from the Global South, with a focus
on AI-generated images of a person, a student, and a street in 20 countries in Latin
America. We argue that LLMs' training data need to expand their sources following a
decolonial approach, increasing geographic representation and incorporating sources in
languages other than English.

Keywords: Data Colonialism, Generative AI, Large Language Models, bias, stereotypes, Latin America

Introduction

Content produced by Large Language Models (LLMs) such as ChatGPT and Dall-E often
reinforces stereotypes and social biases, which can create discrimination and
marginalize those who exist outside AI classification norms (Weidinger et al., 2021).
Efforts to de-bias these Generative AI models usually focus on evaluating and
improving the output, for both text and images, through methods such as "AI red teaming"
and "Reinforcement Learning from Human Feedback" (RLHF), which identify undesirable
behaviors and apply techniques to align these systems. However, these most common
strategies to mitigate bias in LLMs neither alter the training data nor consider the
geographic bias perpetuated by these technologies.
Large Language Models (LLMs) are machine learning models trained to predict the next
word based on large datasets of text, called corpora, usually retrieved from the
internet. Nearly 64% of the content available online is in English, while only 5% of
the world speaks English at home. This means that LLMs' training data do not reflect
different cultures and geographic regions equally and bend toward an Anglo-Saxon
perspective, due to different levels of internet access, infrastructure, and
digitalization worldwide (Solaiman & Talat et al., 2023). Moreover, 37% of the world's
population has never accessed the internet, and 96% of them live in a developing
country, which deepens these inequalities as these marginalized voices are not part of
LLMs' training data (Wang, Morgenstern & Dickerson, 2024).
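To make the underlying mechanism concrete, the sketch below shows next-word prediction with a small open model. It is a minimal illustration assuming the Hugging Face transformers library and the public gpt2 checkpoint, neither of which is the model examined in this study.

```python
# Minimal sketch of next-token prediction, the core task LLMs are trained on.
# Assumes the Hugging Face "transformers" library and the small open "gpt2" model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "A typical street in Latin America is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits          # scores over the whole vocabulary

# Probability of each candidate next word, shaped entirely by the training corpus.
next_token_probs = logits[0, -1].softmax(dim=-1)
top = next_token_probs.topk(5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id.item())!r}: {prob.item():.3f}")
```

Whatever continuations rank highest here directly reflect the training corpus; a corpus dominated by English-language sources will rank Anglo-centric continuations higher.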
Additionally, considering that the text corpora used by LLMs are partly built on news
texts, they reflect many of the same biases and stereotypes as their sources (Trattner et
al., 2022) and reinforce structural inequalities perpetuated by the media. Marginalized
groups are usually underrepresented in news stories, which traditionally are based on
news values and practices that privilege an elite perspective (Wasserman, 2010).
Commonly they are "symbolically annihilated" (Tuchman, 1978) in media coverage, or
included as sources of societal problems (González de Bustamante & Retis, 2016). If
LLMs' training data rely on content that misrepresents minorities and emphasizes
structural inequalities, they will reproduce and augment the biases already identified
in news coverage.
For example, ChatGPT associates a Latin American worker with low-paid jobs
and "skills related to agriculture, manufacturing, or service industries," similar to an
African worker who may have "skills related to agriculture, mining, construction, or other
industries." In contrast, a North American worker is associated with high-paid
professions and "tends to have a strong emphasis on technical and analytical skills",
similar to a European worker who usually has "skills related to technology,
engineering, finance, or other specialized fields."
When asked whether LLMs overemphasize Global North perspectives in their outputs,
ChatGPT recognizes that it "may indeed exhibit a tendency to overemphasize topics,
perspectives, and cultural references from the Global North, primarily due to the nature
of the data they are trained on". The reasons pointed out by ChatGPT are the
dominance of the English language, cultural references, and knowledge gaps.
When asked how to achieve a better representation of the Global South,
ChatGPT answered that it "requires a conscious effort to center the experiences,
perspectives, and priorities of countries and regions". It suggests that we "listen to voices
from the Global South", "understand historical and colonial contexts", "support
decolonial and post-colonial perspectives", and "foster south-south solidarity and
cooperation".
Inspired by discussions about Data Colonialism (Couldry & Mejias, 2019;
Ricaurte, 2019), Algorithmic Colonialism (Birhane, 2020), and efforts to decolonize AI
technologies, this study aims to evaluate to what extent Large Language Models are
biased and stereotyped when they represent cultures from the Global South. Are the
de-biasing actions applied in generative AI systems concerned with increasing the cultural
diversity of datasets?
To evaluate whether the outputs are biased toward a Global North perspective, we
focused on AI images generated by Microsoft Copilot, which runs Dall-E 3, using three
different prompts (person, student, and street) and twenty Latin American countries
located in North America, South America, Central America, and the Caribbean.
Our analysis shows that Latin American countries are represented in a
homogenized way notwithstanding their cultural and natural diversity. The AI images
privilege young people of both genders with lighter skin tones, wearing traditional
clothing. Streets are usually occupied by people and street vendors, surrounded by
historical colonial architecture. We argue that LLMs' training data need to expand their
sources following a decolonial approach that includes more perspectives from the Global
South.

Bias and Stereotypes in Large Language Models

Currently, there is no consensus on how to determine the social impacts of an AI system;
they often depend on the context in which the system is developed, deployed, and used.
Considering that geographic and cultural contexts shift, what constitutes sensitive
content also varies depending on who uses the LLMs and where (Solaiman & Talat et al.,
2023). Therefore, cultural values are a major challenge for an LLM that intends to be
used globally, as what seems appropriate from one perspective can cause harm from
another. "Generative AI systems cannot be neutral or objective, nor can they encompass
truly universal values. There is no 'view from nowhere'; in evaluating anything, a
particular frame of reference is imposed" (Solaiman & Talat et al., 2023).
OpenAI (2023a) implemented various safety measures and processes to diminish
these risks; however, they recognize that GPT-4 is still vulnerable to adversarial attacks
and has the potential to generate harmful content. As stated in the GPT-4 System
Card (OpenAI, 2023a), while mitigation actions can prevent certain kinds of misuse,
these interventions are limited and, among the risks, "language models can amplify
biases and perpetuate stereotypes" by generating potentially harmful content that
reflects worldviews that may not be representative for all users, especially those from
marginalized groups. LLMs have a tendency to hallucinate – produce nonsensical or
untruthful content – and they can also incite violence and discrimination (OpenAI,
2023a). These AI systems have the "potential to reinforce entire ideologies, worldviews,
truths and untruths, and to cement them or lock them in, foreclosing future contestation,
reflection, and improvement" (OpenAI, 2023a).
While evaluating demographic biases in Dall-E 3, OpenAI intended to add
"groundedness" through a process of "prompt expansion" regarding gender and race,
to produce more diverse outputs that represent a broad range of identities and
experiences (OpenAI, 2023b). Despite this, OpenAI acknowledges that bias remains a
problem in Generative AI models, with or without mitigation actions. They confirm that,
by default, Dall-E 3 tends to produce images with disproportionate representations of
white, female, and young individuals, generally privileging a Western point of view
(OpenAI, 2023b).
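The general idea behind this kind of prompt expansion can be illustrated with a short sketch. This is not OpenAI's implementation; the descriptor lists, sampling logic, and function name below are illustrative assumptions only.

```python
import random

# Illustrative-only descriptor lists; the real categories and weights used by
# image-generation systems are not public and are certainly more elaborate.
GENDERS = ["woman", "man", "non-binary person"]
AGES = ["young", "middle-aged", "elderly"]
SKIN_TONES = ["light-skinned", "medium-skinned", "dark-skinned"]

def expand_prompt(prompt: str) -> str:
    """If the prompt mentions a person without demographic detail,
    insert a randomly sampled descriptor before sending it to the image model."""
    if "person" in prompt:
        descriptor = f"{random.choice(AGES)}, {random.choice(SKIN_TONES)} {random.choice(GENDERS)}"
        return prompt.replace("person", descriptor, 1)
    return prompt

print(expand_prompt("a Brazilian person in a classroom"))
# e.g. "a Brazilian elderly, dark-skinned woman in a classroom"
```

The limitation this points to is that expansion acts only at generation time: it diversifies individual renderings but does not change what the underlying training data associate with a nationality.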
Fine-tuning the output based on demographic data is also a major challenge for
mitigating social bias in LLMs, as these models tend to misportray and flatten the
representation of demographic groups and essentialize identities based on stereotypes
(Wang, Morgenstern & Dickerson, 2024). LLMs usually erase group heterogeneity because
their outputs are generated through maximum likelihood estimation. They fail to
recognize within-group heterogeneity and nuances, tending to portray marginalized
groups one-dimensionally, flattening their diversity and individuality (Wang, Morgenstern
& Dickerson, 2024). Furthermore, when verbalizing their confidence, LLMs tend to be
overconfident, which can be mitigated by human-inspired prompts, consistency among
multiple responses, and better aggregation strategies; however, these improvements
diminish as model capacity scales up (Xiong et al., 2023).
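As a rough illustration of the consistency-based strategies mentioned above, the sketch below estimates confidence by sampling several answers to the same question and measuring how often the most frequent answer recurs. The query_model callable is a hypothetical stand-in for any chat-completion call; this is not the exact procedure of Xiong et al. (2023).

```python
from collections import Counter

def consistency_confidence(query_model, question: str, n_samples: int = 10):
    """Sample the model several times and use answer agreement as a confidence proxy,
    rather than trusting the model's own (often overconfident) verbalized confidence.
    `query_model` is a hypothetical function returning one answer string per call."""
    answers = [query_model(question, temperature=1.0) for _ in range(n_samples)]
    counts = Counter(answers)
    best_answer, best_count = counts.most_common(1)[0]
    return best_answer, best_count / n_samples   # e.g. ("Lima", 0.7)
```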
Another obstacle to mitigating bias in Large Language Models is that most of these AI
systems are developed by tech companies based in the United States and China, and they
reflect the values and biases of those contexts. Accordingly, most studies evaluating
their social impacts, stereotypes, and biases are designed in a U.S. or Chinese context
and do not include other geographic populations. For example, a study about whose
opinions language models reflect concludes that Large Language Models usually tend to
represent the perspective of a moderate-to-conservative, low-income population, which
shifts to a more liberal, higher-income profile after RLHF fine-tuning, while still
misrepresenting the 65+ population, Mormons, and widows (Santurkar et al., 2023).
However, like other research in computational linguistics, this study is designed around
a U.S.-centric scenario, which limits the scope of its findings and makes its assumptions
unsuitable for other geographic regions. "Who are the humans that we are/should be
aligning the models to?", the authors ask in a provocative tone (Santurkar et al., 2023).

Evaluating De-Biasing Actions

Among the de-biasing techniques applied to Large Language Models, "AI red
teaming" refers to structured testing efforts to identify flaws and vulnerabilities such as
harmful and discriminatory outputs. Despite its wide use by AI companies, there is
little transparency about how these red-teaming efforts work and what their goals are.
They are usually poorly structured, there are no standards or systematic procedures for
disclosing the results of their work, their costs are not revealed, and the broad range of
vulnerabilities increases the risk of introducing bias into the models (Feffer et al., 2024). In
other words, the opacity regarding the work done by red teams and the hesitancy of
AI companies to publicly release methods and results reduce the reliability of this
method for de-biasing LLMs.
One of the most common ways to mitigate bias in LLMs is to flag certain
keywords and train the models to refuse to answer certain questions. Nevertheless, this
approach can also exacerbate bias and reinforce discrimination by refusing to generate
content for one demographic group while complying for another (OpenAI, 2023a). In order
to minimize risks, GPT models are fine-tuned using a method called "Reinforcement
Learning from Human Feedback" (RLHF), which makes these models more controllable
and useful by producing outputs based on human labelers' preferences. "RLHF fine-tuning
makes our models significantly safer" (OpenAI, 2023a).
RLHF involves three training stages. It starts with pretraining the language
model on a dataset from which inappropriate content (e.g., erotic material) has been
filtered. Then, human annotators evaluate and rank generated text outputs, which are
used to train a reward model (RM). Third, this RM is used to optimize the original
language model, predicting human preferences through reinforcement learning (RL)
(Lambert et al., 2022). This RLHF fine-tuning updates the language model's parameters
with a "policy-gradient RL algorithm" called Proximal Policy Optimization (PPO),
which maximizes the reward signal over the training data. "PPO is a trust region
optimization algorithm that uses constraints on the gradient to ensure the update step
does not destabilize the learning process" (Lambert et al., 2022). One of the limitations
of the RLHF method is its tendency to generate longer outputs after the fine-tuning
process (Singhal et al., 2023).
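To ground the second stage described above, the sketch below shows the standard pairwise preference loss used to train a reward model on human rankings: the model is pushed to score the output the annotators preferred higher than the rejected one. This is a minimal PyTorch sketch assuming a hypothetical reward_model that returns one scalar score per input; it is not OpenAI's code.

```python
import torch.nn.functional as F

def preference_loss(reward_model, chosen_batch, rejected_batch):
    """Pairwise (Bradley-Terry style) loss for reward-model training.
    `reward_model` is a hypothetical network returning a scalar score per example;
    `chosen_batch` holds outputs annotators preferred, `rejected_batch` the others."""
    r_chosen = reward_model(chosen_batch)        # shape: (batch_size,)
    r_rejected = reward_model(rejected_batch)    # shape: (batch_size,)
    # Maximize the log-probability that the chosen output outscores the rejected one.
    return -F.logsigmoid(r_chosen - r_rejected).mean()
```

In the third stage, this trained reward model supplies the reward signal that PPO maximizes while constraining how far the updated policy can drift from the original language model.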

Methods

To evaluate how biased and stereotyped LLMs are toward a Global North
perspective, this study collected 1,260 AI-generated images1 using Microsoft
Copilot Designer, which runs Dall-E 3, focusing on Latin America. We used three
different prompts: "a [nationality] person", "a [nationality] student", and "a [nationality]
street", and twenty countries from North America (Mexico), South America (Argentina,
Bolivia, Brazil, Chile, Colombia, Ecuador, Paraguay, Peru, Uruguay, Venezuela), and
Central America and the Caribbean (Costa Rica, Cuba, Dominican Republic, El
Salvador, Guatemala, Haiti, Honduras, Nicaragua, Panama).

1
AI-generated images are available at https://utexas.box.com/s/039hl7bfb6fm4mxvepvt8ev39dvsz9qd.
After collecting the images, we manually coded them using different categories.
For the person and student prompts, we labeled the images by gender, skin tone (using
the Fitzpatrick scale), age, background, and type of clothing. For street images, we
analyzed the architecture, the type of cars, and whether street vendors were present.
This methodology was inspired by two data journalism projects that
investigated bias and stereotypes in AI-generated images, focused on professions
(Nicoletti & Bass, 2023) and on different cultures (Turk, 2023), using Stable Diffusion
and Midjourney respectively. We adapted their analysis, first, to emphasize the Global
South, choosing Latin American countries as a case study, and, second, to use a different
AI image generator, in this case Copilot / Dall-E 3.
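For reproducibility, the sketch below builds the 60 prompt combinations (three templates crossed with twenty countries) and an empty coding sheet with the categories listed above. The nationality adjectives, file name, and column labels are our own rendering; the original data collection and coding were performed manually through the Copilot interface.

```python
import csv
import itertools

NATIONALITIES = [
    "Mexican", "Argentine", "Bolivian", "Brazilian", "Chilean", "Colombian",
    "Ecuadorian", "Paraguayan", "Peruvian", "Uruguayan", "Venezuelan",
    "Costa Rican", "Cuban", "Dominican", "Salvadoran", "Guatemalan",
    "Haitian", "Honduran", "Nicaraguan", "Panamanian",
]
TEMPLATES = ["a {} person", "a {} student", "a {} street"]

# 3 templates x 20 countries = 60 distinct prompts; repeated generations
# yielded the 1,260 images analyzed in this study.
prompts = [template.format(nat) for nat, template in itertools.product(NATIONALITIES, TEMPLATES)]

# Empty manual-coding sheet mirroring the categories described in the Methods.
with open("coding_sheet.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["prompt", "image_id", "gender", "fitzpatrick_skin_tone", "age",
                     "background", "clothing", "architecture", "car_type", "street_vendors"])
    for prompt in prompts:
        writer.writerow([prompt] + [""] * 9)
```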

Latin America is not a country!

In a homogenized representation, the images generated by Dall-E 3 portray
different countries in Latin America as a single population, despite their cultural and
natural diversity. These images also privilege young people (81.7%) of both genders with
lighter skin tones (70.9%), wearing traditional clothing (86%).
There is a stereotypical perception of Latin America as a homogeneous region
(Friedrich, Mesquita & Hatum, 2005) and as a monolithic group, which is reflected in US
popular culture and in the US media (Alarcón, 2014). Latin Americans are often
pictured as living in the past and essentially in rural areas, although nearly 72 percent of
the population lives in cities with more than one hundred thousand inhabitants. "In
reality, there are more cities of greater than five hundred thousand inhabitants in
contemporary Latin America than in the United States" (Alarcón, 2014).
This Latin American myth of homogeneity is perpetuated in the training data used
by LLMs, considering that most of the sources that compose these datasets are in
English and that the biggest tech companies developing these technologies are based
in the United States. Nevertheless, diversity in Latin America is the norm rather
than the exception, and it is erroneous to classify the region as a monolithic group without
considering the vast differences between countries and regions (Alarcón, 2014).
Figure 1 – Similar pictures of people from different Latin American countries

These images (Figure 1) show four young white people wearing traditional clothing
and hats that stand out for their vibrant colors. Surprisingly, they are from four
different countries: Brazil, Chile, Costa Rica, and Colombia. A similar result can be
seen in the images below (Figure 2), in which four young people are surrounded by fruits
and dressed in traditional clothing. They are from Nicaragua, Guatemala, Mexico, and El
Salvador.

Figure 2 – AI generated images represent Latin Americans surrounded by fruits

Despite the homogenized representation of different countries and populations in
Latin America, these images are usually accompanied by captions that emphasize their
cultural diversity, such as "I hope you find it captures the vibrant and diverse spirit of
Cuba!", "I hope it brings the rich culture of Chile to life for you!", or "The image
showcases diversity and creativity, reflecting the rich heritage of Nicaragua."
In a quantitative analysis of the images generated by the prompt "person" (Figure
3), we observe a balance in gender representation; however, these images mainly
depict young people (81.7%) posing in natural landscapes (43.3%) and wearing
traditional clothing (86%). Furthermore, most have lighter skin tones (70.9%) and
European facial features, with low representation of Indigenous (18.7%) and Black
(10.8%) populations.

Figure 3 – Statistical analysis of AI generated images by “Person”
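The shares reported above come from simple tallies over the manual coding sheet. The sketch below shows that computation under the assumptions of the hypothetical coding_sheet.csv introduced in the Methods section; it is illustrative, not the script actually used.

```python
import pandas as pd

# Tally coding categories for the "person" prompts and report percentage shares,
# assuming the hypothetical coding sheet sketched in the Methods section.
df = pd.read_csv("coding_sheet.csv")
person = df[df["prompt"].str.endswith("person")]

for column in ["gender", "age", "fitzpatrick_skin_tone", "background", "clothing"]:
    shares = person[column].value_counts(normalize=True).mul(100).round(1)
    print(f"\n{column}:\n{shares}")
```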

In a qualitative analysis of representation based on skin tone, consider the four
images generated with the prompt "a Brazilian student" (Figure 4), in which
all the students in the spotlight have light skin. Black students appear only in the
background, occupying the second, third, or fourth row in the classroom. This
misrepresentation does not account for the fact that 55% of the population in Brazil
identifies as Black or brown, according to the 2022 Census.

Figure 4 – The “whiteness” of a Brazilian Student


Similar to the previous examples, Figure 5 shows streets from four different Latin
American countries occupied by informal vendors and people, reinforcing a
representation of an underdeveloped economy. Moreover, the architecture emphasizes
a colonial style instead of contemporary constructions. These images were generated using
Peru, Colombia, Honduras, and Haiti in their prompts.

Figure 5 - Streets full of vendors and people

When cars appear in the images (Figure 6), they make up a bucolic scene
surrounded by colonial architecture, palm trees, and flowers. These old cars are usually
associated with Cuba, but in Dall-E 3 they cross borders and also appear on streets
in the Dominican Republic, Argentina, and Uruguay.

Figure 6 – Cuban Old Cars

Conclusion
To exemplify how LLMs portray different countries from the Global South, this
study showed that AI-generated images usually represent Latin American countries in a
homogenized way, as if they were all the same. Moreover, these representations
reinforce colonial perspectives and historical "underdeveloped" stereotypes instead
of contemporary viewpoints. They also privilege a young, white, tropically flavored ideal
of beauty, misrepresenting the cultural diversity of the Indigenous and Black populations
that compose an important demographic stratum.
To improve the quality of the text and images generated, Large Language Models
need to expand the sources that compose their training data, following a decolonial
approach that includes more perspectives from the Global South. In addition to fine-tuning
language models with RLHF and red teaming, de-biasing actions should also
focus on increasing geographic representation and cultural diversity, incorporating more
sources in languages other than English into the training data.
This study suggests three actions that could be taken in this direction. First,
LLM databases need to expand their sources to include more perspectives from
marginalized communities, especially from the Global South, in order to create more
diverse outputs. Second, specific content moderation teams should be assigned to
evaluate prompts and outputs that target minority groups, with a broader perspective
that includes the Global South and is less Anglo-Saxon centric. Finally, the companies
developing LLMs should adopt a more transparent approach, disclosing the training data
used by their generative AI tools and sharing links to the sources used to generate the
outputs. This would facilitate auditing and the detection of knowledge gaps.

REFERENCES

Birhane, A. (2021). Algorithmic injustice: a relational ethics approach. Patterns, 2(2), 100205.

Birhane, A., & Guest, O. (2020). Towards decolonising computational sciences. arXiv
preprint arXiv:2009.14258.

Birhane, A. (2020). Algorithmic colonization of Africa. SCRIPTed, 17, 389.

Blodgett, S. L., Barocas, S., Daumé III, H., & Wallach, H. (2020). Language
(technology) is power: A critical survey of "bias" in NLP. arXiv preprint arXiv:2005.14050.
Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan,
A., Shyam, P., Sastry, G., Askell, A., & Agarwal, S. (2020). Language models are few-
shot learners. Advances in Neural Information Processing Systems, 33, 1877-1901.

Couldry, N., & Mejias, U. A. (2019). Data colonialism: Rethinking big data’s relation to
the contemporary subject. Television & New Media, 20(4), 336-349.

Feffer, M., Sinha, A., Lipton, Z. C., & Heidari, H. (2024). Red-Teaming for Generative
AI: Silver Bullet or Security Theater? arXiv preprint arXiv:2401.15897

Ferrara, E. (2023). Should ChatGPT be biased? Challenges and risks of bias in large
language models. arXiv preprint arXiv:2304.03738.

Lambert, N., Castricato, L., von Werra, L. & Havrilla, A. (2022, December 9) Illustrating
Reinforcement Learning from Human Feedback (RLHF). Hugging Face.
https://huggingface.co/blog/rlhf

Meade, N., Poole-Dayan, E., & Reddy, S. (2021). An empirical survey of the
effectiveness of debiasing techniques for pre-trained language models. arXiv preprint
arXiv:2110.08527.

Nicoletti, L. & Bass, D. (2023). Humans are biased. Generative AI is even worse.
Bloomberg. Retrieved from https://www.bloomberg.com/graphics/2023-generative-ai-bias/

OpenAI (2023a, March 23). GPT-4 System Card. https://cdn.openai.com/papers/gpt-4-system-card.pdf

OpenAI (2023b, October 3). DALL-E 3 System Card. https://cdn.openai.com/papers/DALL_E_3_System_Card.pdf

Ricaurte, P. (2019). Data epistemologies, the coloniality of power, and resistance.
Television & New Media, 20(4), 350-365.

Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023, July).
Whose opinions do language models reflect? In International Conference on Machine
Learning (pp. 29971-30004). PMLR.

Singhal, P., Goyal, T., Xu, J., & Durrett, G. (2023). A long way to go: Investigating
length correlations in RLHF. arXiv preprint arXiv:2310.03716.

Solaiman, I., Talat, Z., Agnew, W., Ahmad, L., Baker, D., Blodgett, S. L., Daumé III, H.,
Dodge, J., Evans, E., Hooker, S., et al. (2023). Evaluating the social impact of
generative AI systems in systems and society. arXiv preprint arXiv:2306.05949.
Trattner, C., Jannach, D., Motta, E., Costera Meijer, I., Diakopoulos, N., Elahi, M., ... &
Moe, H. (2022). Responsible media technology and AI: challenges and research
directions. AI and Ethics, 2(4), 585-594.

Tuchman, G. (1978). "Introduction: The Symbolic Annihilation of Women by the Mass
Media," in G. Tuchman, A. K. Daniels, and J. Benét (eds.), Hearth and Home: Images of
Women in the Mass Media. New York: Oxford University Press, 3–38.

Turk, V. (2023, October 10). How AI reduces the world to stereotypes. Rest of World.
Retrieved from https://restofworld.org/2023/ai-image-stereotypes/

Wang, A., Morgenstern, J., & Dickerson, J. P. (2024). Large language models cannot
replace human participants because they cannot portray identity groups. arXiv preprint
arXiv:2402.01908.

Wasserman, H. (2010). Media ethics and human dignity in the postcolony. Media ethics
beyond borders: A global perspective, 74-89.

Weidinger, L., Mellor, J., Rauh, M., Griffin, C., Uesato, J., Huang, P., Cheng, M.,
Glaese, M., Balle, B., Kasirzadeh, A., Kenton, Z., Brown, S., Hawkins, W., Stepleton, T.,
Biles, C., Birhane, A., Haas, J., Rimell, L., Hendricks, L., & Gabriel, I. (2021). Ethical
and social risks of harm from Language Models. arXiv preprint arXiv:2112.04359.
https://arxiv.org/abs/2112.04359

Weidinger, L., Uesato, J., Rauh, M., Griffin, C., Huang, P. S., Mellor, J., Glaese, A.,
Cheng, M., Balle, B., Kasirzadeh, A., Biles, C., Brown, S., Kenton, Z., Hawkins, W.,
Stepleton, T., Birhane, A., Hendricks, L. A., Rimell, L., Isaac, W., Haas, J., Legassick,
S., Irving, G., & Gabriel, I. (2022, June). Taxonomy of risks posed by language models.
In 2022 ACM Conference on Fairness, Accountability, and Transparency
(FAccT '22). Association for Computing Machinery, New York, NY, USA, 214–229.
https://doi.org/10.1145/3531146.3533088

Xiong, M., Hu, Z., Lu, X., Li, Y., Fu, J., He, J., & Hooi, B. (2023). Can LLMs Express
their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs. arXiv
preprint arXiv:2306.13063.

Zhou, Y., Kantarcioglu, M., & Clifton, C. (2021). Improving fairness of AI systems with
lossless de-biasing. arXiv preprint arXiv:2105.04534.
