PESHAWAR: Hanifur Rahman, a noted research scholar, is collecting datasets to make Pashto a language of artificial intelligence, as the software requires more than 100 hours of recorded clips by the end of the ongoing year.

He said that the Pashto Decentralised Autonomous Organisation was set up to build an autonomous, transparent and inclusive community for the adoption of Pashto as a perfect tool on computers and in AI.

He said that the motivation behind the project was that around 60 million people across the globe spoke Pashto, a language of rich linguistic and cultural diversity, yet little digital content in Pashto was available online.

Prof Farkhanda Liaquat, the director of Pashto Academy, University of Peshawar, termed the project a revolutionary step towards digitisation of the Pashto language corpus. She stated that the project would open a new window to integrated knowledge and research regarding Pashto and Pashtuns at large.

Mr Rahman, when contacted by this scribe, said that unfortunately Pashto was not among the 100 languages being used as an effective tool to access information, but the initiative would soon establish Pashto as a computer and AI language. He said that efforts were being made to build a Pashto digital assistant like Apple Siri or Hey Google, as the app would enable the Pashto-speaking community worldwide to communicate with their mobile phones and computers through voice commands as well as written text.

He said that the main objective of the initiative was to enable Pashto-speaking people across the globe to have easy access to any kind of information in Pashto.

Mr Rahman stated that the purpose of the organisation was to create a worldwide community of Pashto linguists, journalists, teachers, translators, performers, artists, literati and hi-tech specialists to contribute to the scholastic cause.

“For the software to work for all Pashtuns, including the young, women and men speaking any accent of Pashto, we need as many sample voices as possible to bring Pashto to one of the most significant AI dataset projects. Pashtun users may go to the link https://commonvoice.mozilla.org/ps and record at least 10 Pashto sentences in their own voice each day for a week or as long as they can,” he said.

Mr Rahman said that a Pashto Automatic Speech Recognition System had already been developed by his team. However, he regretted that due to the lack of a usable voice dataset, it made a few errors and could only be improved with further data. “Once an open-source ChatGPT-like model named ‘Aya’ is released, one could talk/chat in Pashto with that large AI model,” he explained.

He said that he wanted the translation of the Common Voice portal into Pashto to be completed, to promote Pashto from a low-resource to a web-rich language and to devise a unified approach towards the Pashto language corpus.

Published in Dawn, May 20th, 2024
