A group of scientists in Africa is working to create advanced artificial intelligence tools specifically designed for African languages.
According to Kathleen Siminyu of the Masakhane Research Foundation, widely used AI technologies such as the text-generating ChatGPT and the voice-activated Siri are inaccessible to billions of people who do not speak mainstream languages such as English, French and Spanish. This, she says, makes greater inclusivity and representation in the advancement of language technology a necessity.
“It doesn’t make sense to me that there are limited AI tools for African languages. Inclusion and representation in the advancement of language technology is not a patch you put at the end — it’s something you think about upfront,” said Ms Siminyu.
Holding significant value
Many of these tools rely on a field of AI called natural language processing, a technology that enables computers to understand human languages. Computers can master a language through training, in which they pick up on patterns in speech and text data. The tools fall short when data in a particular language is scarce, as is the case with African languages.
To fill the gap, the research team first identified key players involved in developing African language tools and explored their experience, motivations, areas of focus, and challenges. These people include writers and editors who create and curate content, as well as linguists, software engineers, and entrepreneurs who are crucial in establishing the infrastructure for language tools.
Interviews with the key players revealed four crucial considerations in designing African language tools. First is the dilemma posed by Africa’s multilingual society, which requires acknowledging the impact of colonisation. Indigenous languages hold significant cultural value and also play a pivotal role in enabling people to participate in society, from education systems to political landscapes and economic spheres.
Second, there is a need to support African content creation. This includes building basic tools such as dictionaries, spell checkers, and keyboards for African languages, and removing financial and administrative barriers to translating government communications into multiple national languages, including African languages.
Third, there is a need to nurture collaboration between linguistics and computer science, as the intersection of these two fields can lead to innovative, human-centred solutions.
Finally, there is a need to adhere to ethical practices and to prioritise community involvement throughout the collection and use of data.
Next, the team will address barriers that may hinder people’s access to the technology. The study should serve as a roadmap to help develop a wide range of language tools, from translation services to misinformation-catching content moderators.
The findings may also pave the way to preserve indigenous African languages.
“I would love for us to live in a world where Africans can have as good quality of life and access to information and opportunities as somebody fluent in English, French, Mandarin, or other languages,” says Siminyu.
“There is a growing number of organisations working in this space, and this study allows us to coordinate efforts in building impactful language tools,” says Siminyu. “The findings highlight and articulate what the priorities are, in terms of time and financial investments.”