DAWN.COM

Published 20 May, 2024 07:12am

London-based scholar wants to make Pashto an AI language

PESHAWAR: Hanifur Rahman, a noted research scholar, is collecting datasets to make Pashto a language of artificial intelligence, as the software requires more than 100 hours of recorded clips by the end of the current year.

He said that the Pashto Decentralised Autonomous Organisation was set up to build an autonomous, transparent and inclusive community for the adoption of Pashto as a perfect tool for computers and AI.

He said that the motivation behind the project was that around 60 million people across the globe speak Pashto, a language with rich linguistic and cultural diversity, yet little digital content in Pashto is available online.

Prof Farkhanda Liaquat, the director of Pashto Academy, University of Peshawar, termed the project a revolutionary step towards digitisation of the Pashto language corpus. She stated that the project would open a new window to integrated knowledge and research regarding Pashto and Pashtuns at large.

Academy director terms the project a revolutionary step

Mr Rahman, when contacted by this scribe, said that unfortunately Pashto was not among the 100 languages being used as an effective tool to access information, but the initiative would soon establish Pashto as a language of computers and AI. He said that efforts were being made to build a Pashto digital assistant like Apple Siri or Hey Google, as the app would enable the Pashto-speaking community worldwide to communicate with their mobile phones and computers via voice commands as well as written text.

He said that the main objective of the initiative was to enable Pashto-speaking people across the globe to have easy access to any kind of information in Pashto.

Mr Rahman stated that the purpose of the organisation was to create a worldwide community of Pashto linguists, journalists, teachers, translators, performers, artists, literati and hi-tech specialists to contribute to the scholastic cause.

“For the software to work for all Pashtuns, including the young, women and men, speaking as many accents of Pashto as possible, we need sample voices to bring Pashto to one of the most significant AI dataset projects. Pashtun users may go to the link https://commonvoice.mozilla.org/ps and record at least 10 Pashto sentences in her/his own voice each day for a week or as long as one can,” he said.

Mr Rahman said that a Pashto Automatic Speech Recognition System had already been developed by his team. However, he regretted that due to the lack of a usable voice dataset, it made a few errors and could only be improved with more data. “Once an open-source, ChatGPT-like model named ‘Aya’ is released, one could talk or chat in Pashto with that large AI model,” he explained.

He said that he wanted the translation of the Common Voice portal into Pashto to be completed in order to promote Pashto from a low-resource to a web-rich language and to devise a unified approach towards a Pashto language corpus.

Published in Dawn, May 20th, 2024
