When A.I. Fails the Language Test, Who Is Left Out of the Conversation?

By catfish | July 30, 2024

Stanford researchers gave a popular A.I. chatbot a language test.

They asked the bot in Vietnamese to write a traditional poem in the form known as “song thất lục bát” that follows a pattern of lines made up of seven, seven, six, then eight words. When the bot spit out an answer, it wrote a poem, but didn’t follow the format.

The team tried a different prompt, asking what the proper Vietnamese word was for a mother’s younger brother, and it responded with the words for a father’s younger and older siblings.

These flaws are not unique to Claude 3.5, the chatbot by the artificial intelligence company Anthropic that the researchers queried, but they illustrate some of the ways in which artificial intelligence can get language outside of standard American English wrong.

While the use of A.I. has exploded in the West, much of the rest of the world has been left out of the conversation since most of the technology is trained in English. A.I. experts worry that the language gap could exacerbate technological inequities, and that it could leave many regions and cultures behind.

A delay of even a few years in access to good technology “can potentially lead to a few decades of economic delay,” said Sang Truong, a Ph.D. candidate at the Stanford Artificial Intelligence Laboratory who was on the team that built and tested a Vietnamese language model against others.

The tests his team ran found that A.I. tools across the board could get facts and diction wrong when working with Vietnamese, likely because it is a “low-resource” language by industry standards, which means that there aren’t sufficient data sets and content available online for the A.I. model to learn from.

Low-resource languages are spoken by tens and sometimes hundreds of millions of people around the world, but they yield less digital data because A.I. tech development and online engagement are centered in the United States and China. Other low-resource languages include Hindi, Bengali and Swahili, as well as lesser-known dialects spoken by smaller populations around the world.

An analysis of top websites by W3Techs, a tech survey company, found that English makes up over 60 percent of the internet’s language data. While English is widely spoken globally, native English speakers make up only about 5 percent of the world’s population, according to Ethnologue, a research organization that collects language data. Mandarin and Spanish are other examples of languages with a significant online presence and reliable digital data sets.

Academic institutions, grass-roots organizations and volunteer efforts are playing catch-up to build resources for speakers of languages that aren’t as well represented in the digital landscape.

Lelapa AI, a start-up based in Johannesburg, is one company leading such efforts on the African continent. It is developing multilingual A.I. products for people and businesses in Africa.

“I think it’s such a dangerous concept that people need to assimilate to a different culture and have to take on different cultures in order to have access to progress,” said Pelonomi Moiloa, chief executive and co-founder of Lelapa AI.

The company is less focused on scale than on community-specific solutions, she said. It is crafting its products to be more resource-efficient and cost-effective, and to be used primarily for speech-to-speech communication in local languages, which makes the technology more accessible to African people.

“Large companies like Google, Apple, OpenAI, for example, have not necessarily trained their models for tools that serve these markets,” Chinasa T. Okolo, a fellow at the Center for Technology Innovation at the Brookings Institution, said about communities with low-resource languages. “They don’t provide enough market value for them to do so.”

A communications officer for OpenAI said that the company releases A.I. systems steadily to more groups of people, and that its latest model supports over 50 languages. Google pointed to its projects focusing on A.I. development for underrepresented languages, including a “1,000 languages” initiative, announced in 2022, to build language models for the 1,000 most-spoken languages in the world. Apple said it, too, has developed products to support a range of languages.

The consequences of the language gap in A.I. tools can be numerous. The technology has potential to increase productivity and change workplaces, but without reliable data in local languages, some regions of the world could miss out on the economic benefits, according to A.I. experts. The exclusion of low-resource languages could also lead to cultural bias in A.I. products.

A.I.’s lack of knowledge in low-resource languages has the potential to raise security concerns as well. Sara Hooker, the head of Cohere for AI, the nonprofit research arm of the start-up Cohere, said some users could bypass the safety measures of A.I. products by asking questions in other languages.

“You can easily, for example, still get very dangerous instructions about how to build a bomb just by switching to a different language,” Ms. Hooker said.

Ms. Hooker’s team at Cohere for AI launched a broad model and data set for multilingual A.I., called Aya, in February. It includes 101 languages and relies on the volunteer efforts of over 3,000 independent researchers. But Ms. Hooker said that even a project that big wasn’t a solution to the language lag.

She said that in A.I., the industry is often focused on the latest model and how it performs, “but in this particular topic, it’s also reshaping the ecosystem as a whole,” adding that the gap will widen unless researchers from around the world are involved as A.I. develops further and at a rapid pace.

While the issue is obvious to many in the industry, the solutions are complicated. Large language models, or L.L.M.s, which are used in technology to communicate in human language, require large banks of high-quality data, often collected from the internet and not easily accessible for low-resource languages. Mr. Truong equated building an L.L.M. to teaching a newborn: There may be 20,000 books with lessons in English, but there are just five in Vietnamese.

The disparity is so large in some regions that governments have stepped in to back efforts to build their own language models. This spring, the Nigerian government promised to back the tech start-up Awarri in building a model for local languages. Both Iceland’s government and the Welsh government work with OpenAI to improve ChatGPT’s understanding of the native languages there.

“The language gap is really important in terms of access, but it is also just really important to help re-energize people’s sense of pride in who they are, where they come from,” Ms. Moiloa of Lelapa AI said.

Sanmi Koyejo, the head of Stanford Trustworthy A.I. Research at Stanford University, said including more languages in all A.I. products is also important to capture cultural nuances and diverse perspectives.

Dr. Koyejo pointed to a Stanford study that fed questions from Pew Research to A.I. chatbots to gauge their biases. He said the chatbots’ answers most closely matched the views of people in California, where much of the technology is being developed.

“Culture is a big aspect of this,” he said. “You lose something if you’re only seeing the internet slash U.S.-centric version of the world.”
