A New Vision of Artificial Intelligence For The People


12/5/24, 8:11 A new vision of artificial intelligence for the people | MIT Technology Review

EDEL RODRIGUEZ

ARTIFICIAL INTELLIGENCE

A new vision of
artificial intelligence
for the people
In a remote rural town in New Zealand, an Indigenous
couple is challenging what AI could be and who it
should serve.

by Karen Hao
April 22, 2022

https://www.technologyreview.com/2022/04/22/1050394/artificial-intelligence-for-the-people/ 1/18

This story is the fourth and final part of MIT Technology Review’s series on AI colonialism,
the idea that artificial intelligence is creating a new colonial world order. It was supported
by the MIT Knight Science Journalism Fellowship Program and the Pulitzer Center. Read
the full series here.

In the back room of an old and graying building in the northernmost region of New
Zealand, one of the most advanced computers for artificial intelligence is helping to
redefine the technology’s future.

Te Hiku Media, a nonprofit Māori radio station run by life partners Peter-Lucas
Jones and Keoni Mahelona, bought the machine at a 50% discount to train its own
algorithms for natural-language processing. It’s now a central part of the pair’s
dream to revitalize the Māori language while keeping control of their community’s
data.

Mahelona, a native Hawaiian who settled in New Zealand after falling in love with
the country, chuckles at the irony of the situation. “The computer is just sitting on a
rack in Kaitaia, of all places—a derelict rural town with high poverty and a large
Indigenous population. I guess we’re a bit under the radar,” he says.

The project is a radical departure from the way the AI industry typically operates.
Over the last decade, AI researchers have pushed the field to new limits with the
dogma “More is more”: Amass more data to produce bigger models (algorithms
trained on said data) to produce better results.

The approach has led to remarkable breakthroughs—but to costs as well.


Companies have relentlessly mined people for their faces, voices, and behaviors to
enrich bottom lines. And models built by averaging data from entire populations
have sidelined minority and marginalized communities even as they are
disproportionately subjected to the technology’s impacts.

Over the years, a growing chorus of experts have argued that these impacts are
repeating the patterns of colonial history. Global AI development, they say, is
impoverishing communities and countries that don’t have a say in its development
—the same communities and countries already impoverished by former colonial
empires.


Peter-Lucas Jones (left) and Keoni Mahelona (right) attend an Indigenous AI workshop in 2019.
COURTESY PHOTO

This has been particularly apparent for artificial intelligence and language. “More is
more” has produced large language models with powerful autocomplete and text
analysis capabilities now used in everyday services like search, email, and social
media. But these models, built by hoovering up large swathes of the internet, are
also accelerating language loss, in the same way colonization and assimilation
policies did previously.

Only the most common languages have enough speakers—and enough profit
potential—for Big Tech to collect the data needed to support them. Relying on such
services in daily work and life thus coerces some communities to speak dominant
languages instead of their own.


“Data is the last frontier of colonization,” Mahelona says.



In turning to AI to help revive te reo, the Māori language, Mahelona and Jones, who
is Māori, wanted to do things differently. They overcame resource limitations to
develop their own language AI tools, and created mechanisms to collect, manage,
and protect the flow of Māori data so it won’t be used without the community’s
consent, or worse, in ways that harm its people.

Now, as many in Silicon Valley contend with the consequences of AI development
today, Jones and Mahelona’s approach could point the way to a new generation of
artificial intelligence—one that does not treat marginalized people as mere data
subjects but reestablishes them as co-creators of a shared future.

Like many Indigenous languages globally, te reo Māori began its decline with
colonization.

After the British laid claim to Aotearoa, the te reo name for New Zealand, in 1840,
English gradually took over as the lingua franca of the local economy. In 1867, the
Native Schools Act then made it the only language in which Māori children could
be taught, as part of a broader policy of assimilation. Schools began shaming and
even physically beating Māori students who attempted to speak te reo.

In the following decades, urbanization broke up Māori communities, weakening
centers of culture and language preservation. Many Māori also chose to leave in
search of better economic opportunities. Within a generation, the proportion of te
reo speakers plummeted from 90% to 12% of the Māori population.

In the 1970s, alarmed by this rapid decline, Māori community leaders and activists
fought to reverse the trend. They created childhood language immersion schools
and adult learning programs. They marched in the streets to demand that te reo
have equal status with English.

In 1987, 120 years after actively supporting its erasure, the government finally
passed the Māori Language Act, declaring te reo an official language. Three years
later, it began funding the creation of iwi, or tribal, radio stations like Te Hiku
Media, to publicly broadcast in te reo to increase the language’s accessibility.

Many Māori I speak to today identify themselves in part by whether or not their
parents or grandparents spoke te reo Māori. It’s considered a privilege to have
grown up in an environment with access to intergenerational language
transmission.

This is the gold standard for language preservation: learning through daily exposure
as a child. Learning as a teen or adult in an academic setting is not only harder; a
textbook also often teaches only a single, or “standard,” version of te reo, when each
iwi, or tribe, has unique accents, idiomatic expressions, and embedded regional
histories.

Language, in other words, is more than just a tool for communication. It encodes a
culture as it’s passed from parent to child, from child to grandchild, and evolves
through those who speak it and inhabit its meaning. It also influences as much as it
is influenced, shaping relationships, worldviews, and identities. “It’s how we think
and how we express ourselves to each other,” says Michael Running Wolf, another
Indigenous technologist who’s using AI to revive a rapidly disappearing language.


To preserve a language is thus to preserve a cultural history. But in the digital age
especially, it takes constant vigilance to yank a minority language out of its
downward trajectory. Every new communication space that doesn’t support it
forces speakers to choose between using a dominant language and forgoing
opportunities in the larger culture.

“If these new technologies only speak Western languages, we’re now excluded from
the digital economy,” says Running Wolf. “And if you can’t even function in the
digital economy, it’s going to be really hard for [our languages] to thrive.”


With the advent of artificial intelligence, language revitalization is now at a
crossroads. The technology can further codify the supremacy of dominant
languages, or it can help minority languages reclaim digital spaces. This is the
opportunity that Jones and Mahelona have seized.

Long before Jones and Mahelona embarked on this journey, they met over
barbecue at their swimming club’s member gathering in Wellington. The two
instantly hit it off. Mahelona took Jones on a long bike ride. “The rest is history,”
Mahelona says.

In 2012, the pair moved back to Jones’s hometown of Kaitaia, where Jones became
CEO of Te Hiku Media. Because of its isolation, the region remains one of the most
economically impoverished of Aotearoa, but by the same token, its Māori
population is among the country’s best protected.

Over its 20-odd years of broadcasting history, Te Hiku had amassed a rich archive of
te reo audio materials. It includes gems like a recording of Jones’s own grandmother
Raiha Moeroa, born in the late 19th century, whose te reo remained largely
untouched by colonial influence.

Jones saw an opportunity to digitize the archive and create a more modern
equivalent of intergenerational language transmission. Most Māori no longer live
with their iwis and can’t rely on nearby kin for daily te reo exposure. With a digital
library, however, they’d be able to listen to te reo from bygone elders whenever and
wherever they wanted.

The local Māori tribes granted him permission to proceed, but Jones needed a place
to host the materials online. Neither he nor Mahelona liked the idea of uploading
them to Facebook or YouTube. It would give the tech giants license to do what they
wanted with the precious data.

(A few years later, companies would indeed begin working with Māori speakers to
acquire such data. Duolingo, for example, sought to build language-learning tools
that could then be marketed back to the Māori community. “Our data would be used
by the very people that beat that language out of our mouths to sell it back to us as
a service,” Jones says. “It’s just like taking our land and selling it back to us,”
Mahelona adds.)

The only alternative was for Te Hiku to build its own digital hosting platform. With
his engineering background, Mahelona agreed to lead the project and joined as CTO.

The digital platform became Te Hiku’s first major step to establishing data
sovereignty—a strategy in which communities seek control over their own data in
an effort to ensure control over their future. For Māori, the desire for such
autonomy is rooted in history, says Tahu Kukutai, a cofounder of the Māori data
sovereignty network. During the earliest colonial censuses, after a series of
devastating wars in which they killed thousands of Māori and confiscated their land,
the British collected data on tribal numbers to track the success of the government’s
assimilation policies.

Data sovereignty is thus the latest example of Indigenous resistance—against
colonizers, against the nation-state, and now against big tech companies. “The
nomenclature might be new, the context might be new, but it builds on a very old
history,” Kukutai says.

In 2016, Jones embarked on a new project: to interview native te reo speakers in
their 90s before their language and knowledge were lost to future generations. He
wanted to create a tool that would display a transcription alongside each interview.
Te reo learners would then be able to hover over words and expressions to see their
definitions.

But few people had enough mastery of the language to manually transcribe the
audio. Inspired by voice assistants like Siri, Mahelona began looking into natural-
language processing. “Teaching the computer to speak Māori became absolutely
necessary,” Jones says.


But Te Hiku faced a chicken-and-egg problem. To build a te reo speech recognition
model, it needed an abundance of transcribed audio. To transcribe the audio, it
needed the advanced speakers whose small numbers it was trying to compensate
for in the first place. There were, however, plenty of beginning and intermediate
speakers who could read te reo words aloud better than they could recognize them
in a recording.

So Jones and Mahelona, along with Te Hiku COO Suzanne Duncan, devised a
clever solution: rather than transcribe existing audio, they would ask people to
record themselves reading a series of sentences designed to capture the full range
of sounds in the language. To an algorithm, the resulting data set would serve the
same function. From those thousands of pairs of spoken and written sentences, it
would learn to recognize te reo syllables in audio.
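The article does not say how the prompt sentences were chosen. As a rough illustration only, one common way to cover a language's sounds with a short reading list is a greedy set cover. The sketch below uses character bigrams as a crude stand-in for sound units; the function names `sound_units` and `pick_prompts` are hypothetical and are not Te Hiku's code.

```python
from typing import List, Set


def sound_units(sentence: str) -> Set[str]:
    """Crude stand-in for sound units: character bigrams inside each word."""
    units: Set[str] = set()
    for word in sentence.lower().split():
        # Keep letters only (macronized vowels such as "ā" count as letters).
        letters = "".join(ch for ch in word if ch.isalpha())
        units.update(letters[i:i + 2] for i in range(len(letters) - 1))
    return units


def pick_prompts(candidates: List[str], limit: int) -> List[str]:
    """Greedy set cover: keep taking the sentence that adds the most unseen units."""
    covered: Set[str] = set()
    chosen: List[str] = []
    remaining = list(candidates)
    while remaining and len(chosen) < limit:
        best = max(remaining, key=lambda s: len(sound_units(s) - covered))
        gain = sound_units(best) - covered
        if not gain:
            break  # nothing left adds a new unit, so stop early
        chosen.append(best)
        covered |= gain
        remaining.remove(best)
    return chosen
```

Calling `pick_prompts(sentences, limit=500)` on a pool of candidate sentences would return a short list that still touches every bigram the pool contains, or as many as fit within the limit. A real project would more likely work from a phoneme or syllable inventory than from raw spelling.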

The team announced a competition. Jones, Mahelona, and Duncan contacted every
Māori community group they could find, including traditional kapa haka dance
troupes and waka ama canoe-racing teams, and revealed that whichever one
submitted the most recordings would win a $5,000 grand prize.


The entire community mobilized. Competition got heated. One Māori community
member, Te Mihinga Komene, an educator and advocate of using digital
technologies to revitalize te reo, recorded 4,000 phrases alone.

Money wasn’t the only motivator. People bought into Te Hiku’s vision and trusted it
to safeguard their data. “Te Hiku Media said, ‘What you give us, we’re here as
kaitiaki [guardians]. We look after it, but you still own your audio,’” says Te Mihinga.
“That’s important. Those values define who we are as Māori.”

Within 10 days, Te Hiku amassed 310 hours of speech-text pairs from some
200,000 recordings made by roughly 2,500 people, an unheard-of level of
engagement among researchers in the AI community. “No one could’ve done it
except for a Māori organization,” says Caleb Moses, a Māori data scientist who
joined the project after learning about it on social media.

The amount of data was still small compared with the thousands of hours typically
used to train English language models, but it was enough to get started. Using the
data to bootstrap an existing open-source model from the Mozilla Foundation, Te
Hiku created its very first te reo speech recognition model with 86% accuracy.
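To put the reported totals in perspective, the short calculation below derives per-recording and per-person averages from the figures quoted above (310 hours, about 200,000 recordings, roughly 2,500 people). The inputs are rounded in the article, so the results are approximations, not measured values.

```python
# Figures as reported in the article; the last two are approximate.
HOURS = 310
RECORDINGS = 200_000
PEOPLE = 2_500

# Average length of one recording, in seconds.
seconds_per_recording = HOURS * 3600 / RECORDINGS

# Average number of recordings, and minutes of audio, per contributor.
recordings_per_person = RECORDINGS / PEOPLE
minutes_per_person = HOURS * 60 / PEOPLE

print(f"{seconds_per_recording:.2f} s per recording")
print(f"{recordings_per_person:.0f} recordings per person")
print(f"{minutes_per_person:.2f} min per person")
```

On these numbers a typical recording is a single short sentence of a few seconds, and the average contributor gave only a few minutes of audio in total. The volume came from the breadth of participation.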


From there, it branched out into other language AI technologies. Mahelona, Moses,
and a newly assembled team created a second algorithm for auto-tagging complex
te reo phrases, and a third for giving real-time feedback to te reo learners on the
accuracy of their pronunciation. The team even experimented with voice synthesis
to create the te reo equivalent of a Siri, though it ultimately didn’t clear the quality
bar to be deployed.

Along the way, Te Hiku established new data sovereignty protocols. Māori data
scientists like Moses are still few and far between, but those who join from outside
the community cannot just use the data as they please. “If they want to try
something out, they ask us, and we have a decision-making framework based on our
values and our principles,” Jones says.

It can be challenging. The open-source, free-wheeling culture of data science is
often antithetical to the practice of data sovereignty, as is the culture of AI. There
have been times when Te Hiku has let data scientists go because they “just want
access to our data,” Jones says. It now seeks to cultivate more Māori data scientists
through internship programs and junior positions.


Te Hiku has since made most of its tools available as APIs through its new digital
language platform, Papa Reo. It’s also working with Māori-led organizations like the
educational company Afed Limited, which is building an app to help te reo learners
practice their pronunciation. “It’s really a game changer,” says Cam Swaison-
Whaanga, Afed’s founder, who is also on his own te reo learning journey. Students
no longer have to feel shy about speaking aloud in front of teachers and peers in a
classroom.

Te Hiku has begun working with smaller Indigenous populations as well. In the
Pacific region, many share the same Polynesian ancestors as the Māori, and their
languages have common roots. Using the te reo data as a base, a Cook Islands
researcher was able to train an initial Cook Islands language model to reach roughly
70% accuracy using only tens of hours of data.

“It’s no longer just about teaching computers to speak te reo Māori,” Mahelona says.
“It’s about building a language foundation for Pacific languages. We’re all struggling
to keep our languages alive.”

But Jones and Mahelona know there will come a time when they will have to work
with more than Indigenous communities and organizations. If they want te reo to
truly be ubiquitous—to the point of having te reo–speaking voice assistants on
iPhones and Androids—they’ll need to partner with big tech companies.

“Even if you have the capacity in the community to do really cool speech
recognition or whatever, you have to put it in the hands of the community,” says
Kevin Scannell, a computer scientist helping to revitalize the Irish language, who
has grappled with the same trade-offs in his research. “Having a website where you
can type in some text and have it read to you is important, but it’s not the same as
making it available in everybody’s hand on their phone.”

Jones says Te Hiku is preparing for this inevitability. It created a data license that
spells out the ground rules for future collaborations based on the Māori principle of
kaitiakitanga, or guardianship. It will only grant data access to organizations that
agree to respect Māori values, stay within the bounds of consent, and pass on any
benefits derived from its use back to the Māori people.

The license has yet to be used by an organization other than Te Hiku, and there
remain questions around its enforceability. But the idea has already inspired other
AI researchers, like Kathleen Siminyu of Mozilla’s Common Voice project, which
gathers voice donations to build public data sets for speech recognition in different
languages. Right now those data sets can be downloaded for any purpose. But last
year, Mozilla began exploring a license more similar to Te Hiku’s that would give
greater control to language communities that choose to donate their data. “It would
be great if we could tell people that part of contributing to a data set leads to you
having a say as to how the data set is used,” she says.

Margaret Mitchell, the former co-lead of Google’s ethical AI team who conducts
research on data governance and ownership practices, agrees. “This is exactly the
kind of license we want to be able to develop more generally for all different kinds
of technology. I would really like to see more of it,” she says.

In some ways, Te Hiku got lucky. Te reo can take advantage of English-centric AI
technologies because it has enough similarity to English in key features like its
alphabet, sounds, and word construction. The Māori are also a fairly large
Indigenous community, which allowed them to amass enough language data and
find data scientists like Moses to help make their vision a reality.

“Most other communities are not big enough for those happy accidents to occur,”
says Jason Edward Lewis, a digital technologist and artist who co-organizes the
Indigenous AI Network.

At the same time, he says, Te Hiku has been a powerful demonstration that AI can
be built outside the wealthy profit centers of Silicon Valley—by and for the people
it’s meant to serve.


Te Hiku Media receives a New Zealand innovation award for its language revitalization work.
COURTESY PHOTO

The example has already motivated others. Michael Running Wolf and his wife,
Caroline, also an Indigenous technologist, are working to build speech recognition
for the Makah, an Indigenous people of the Pacific Northwest coast, whose
language has only around a dozen remaining speakers. The task is daunting: the
Makah language is polysynthetic, which means a single word, composed of multiple
building blocks like prefixes and suffixes, can express an entire English sentence.
Existing natural-language processing techniques may not be applicable.

Before Te Hiku’s success, “we didn’t even consider looking into it,” Caroline says.
“But when we heard the amazing work they’re doing, it was just fireworks going off
in our head: ‘Oh my God, it’s finally possible.’”

Mozilla’s Siminyu says Te Hiku’s work also carries lessons for the rest of the AI
community. In the way the industry operates today, it’s easy for individuals and
communities to be disenfranchised; value is seen to come not from the people who
give their data but from the ones who take it away. “They say, ‘Your voice isn’t worth
anything on its own. It actually needs us, someone with a capacity to bring billions
together, for each to be meaningful,’” she says.


In this way, then, natural-language processing “is a nice segue into starting to figure
out how collective ownership should work,” she adds. “Because regardless of how
widely spoken they are, languages belong to a people.”

Read the rest of MIT Technology Review's series on AI Colonialism here.

