At the time of European colonisation, there were more than 3,000 Indigenous languages across Africa. There are far fewer now, estimated at fewer than 2,200. Nonetheless, linguists are working to preserve what is left in a digital archive. Documents from the 1950s to the 1980s are being scanned to ensure African communities retain a core part of their identity.

Researchers estimate that in 100 years many of the world’s 7,000 languages could be extinct. Hundreds of years of oral storytelling will disappear in the space of a couple of generations.

The knowledge and beauty locked up in these languages are irreplaceable. They go beyond useful dot points about seasons, cultivation and local medicines to untranslatable words and an entire cosmology. Every language is a multi-generational creative act.

But the sounds, words and stories of all these languages are being captured online, on community websites and video platforms. Cyberlinguists of the future will have to devise algorithms to decipher the recordings that were made before this mass extinction event.

A friend of mine working on such a project wants to determine what language data must be uploaded to ensure that the world’s unwritten linguistic heritage is preserved and made intelligible to all future generations.

Back in 2014 and 2015, he visited the Hadzabe in Tanzania and the Ogiek in Kenya to teach people to use mobile phone software to record and interpret their languages. The software acts like a voice recorder, but it adds the ability to save and share phrase-by-phrase commentaries and translations.
Others experience the original recording together with the interpretation.

He and the research team have recently taken similar software to the Southern African countries of Namibia and Botswana and recorded about 300,000 words of speech in two languages. Clearly, they can readily amass a large quantity of raw language data. The biggest challenge is how to analyse it all.

In his work, the researcher takes as the key to decipherment parallel texts or, in the case of unwritten languages, bilingual aligned audio recordings.

The researcher’s approach was made possible by a recent advance in the processing of digital images. The method uses artificial neural networks and is commonly known as deep learning. Show a ten-year-old boy an image and ask him to point to the cat, and he does it in a split second. Algorithms can do this too, and it is what makes it possible to search the web for images. For the child and the algorithms to identify the cat, they must first work out where to direct attention within the image.

The question is whether the researchers can do the same for audio: whether they can take individual words of the English transcription and correlate them with short stretches of audio in the source language of, say, the Ogiek community in Kenya.

He told me his initial experiments are showing promise and that he has received more funding for his work. In his own words, the final step is to close the loop. After lining up source language words with English translations, his algorithm reports its confidence. He also needs to exploit this information in order to search tens or hundreds of hours of untranslated audio, flag high-value regions, then present these to people for translation.
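To make the idea of attention and confidence a little more concrete, here is a minimal sketch in Python. It is not the researcher’s algorithm, which the article does not describe in detail. It assumes a hypothetical table of similarity scores between each English word and each short stretch of audio, turns every row into attention weights with a softmax, and treats the largest weight as the confidence of that alignment. Treating low-confidence alignments as the ones to send to people first is also an assumption made for illustration, as are all the function names and the threshold.

```python
import math


def softmax(scores):
    """Turn raw similarity scores into attention weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def align_words(words, scores):
    """Align each English word with its most-attended audio segment.

    scores[i][j] is a hypothetical similarity between word i and audio
    segment j. Returns (word, segment_index, confidence) tuples, where
    confidence is the largest attention weight in that word's row.
    """
    alignment = []
    for word, row in zip(words, scores):
        weights = softmax(row)
        best = max(range(len(weights)), key=weights.__getitem__)
        alignment.append((word, best, weights[best]))
    return alignment


def flag_for_review(alignment, threshold=0.6):
    """Return (word, segment) pairs whose confidence is below the threshold.

    Assumption: uncertain alignments are the ones worth showing to a
    human translator first. The threshold value is arbitrary.
    """
    return [(word, seg) for word, seg, conf in alignment if conf < threshold]


if __name__ == "__main__":
    words = ["the", "cat", "sleeps"]
    # Made-up scores: one row per English word, one column per audio segment.
    scores = [
        [0.1, 0.2, 0.1],
        [0.0, 4.0, 0.0],
        [0.0, 0.0, 5.0],
    ]
    for word, seg, conf in align_words(words, scores):
        print(f"{word!r} -> segment {seg} (confidence {conf:.2f})")
    print("needs review:", flag_for_review(align_words(words, scores)))
```

In this toy data, the rows for "cat" and "sleeps" each have one clearly dominant score, so their alignments come out confident, while the nearly flat row for "the" does not and is flagged. A real system would learn the scores from audio with a neural network rather than take them as given.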
The researcher is also planning to extend the current software with social media features. He holds the view that if the app could go mainstream, speakers of Africa’s disappearing languages would use it to record and translate their stories, guided by algorithms in knowing what to translate next.

Speakers of Africa’s disappearing languages are prioritising their own survival. And they are adopting the mindset of the speakers of economically powerful languages, especially English. In that mindset, small languages are no longer relevant, and in some areas they are almost non-existent. To preserve languages, then, researchers must go beyond their technical innovation to unearth the dominant culture.

Owing to the growing urbanisation of Africa’s population, speakers of Africa’s disappearing languages are now found in urban areas. He said that each story is recorded and shared using the software, generating public recognition and evoking pride for each storyteller and for each language. And each bilingual story-listener in the audience is motivated to use the app to record and interpret their parents’ stories for their children.

In this way, researchers are able to return to the most ancient mode of social interaction and storytelling. This time, however, it is captured on mobile devices, and algorithms are helping to prioritise the translation effort. Hopefully, with efforts of this kind, Africa’s treasure languages will be sustained for at least another generation.