With the advent of smart keyboards, the Urdu script has arrived on mobile phones. However, typing in it remains tedious at best.

Most people resort to writing Urdu in the Latin script, a practice known as Roman Urdu.

The Data Science Lab (DSL) at Lahore’s Information Technology University has built an application called the Urdu-Hindi Dictionary to help overcome this language barrier. It offers a powerful custom search that lets users find the translation of a word in Arabic, Urdu, Hindi or English.

More than 542 million people on the planet speak either Hindi or Urdu, which places the two among the most widely spoken languages.

“Due to its popularity, Roman Urdu and Roman Hindi is emerging as a language so the Urdu-Hindi Dictionary has a lot of potential,” says Dr. Faisal Kamiran, director of the Data Science Lab.

“Our research is mainly about understanding this Roman Urdu text which is used very frequently in SMS, Facebook, Tweets and other forms of social media,” says Dr. Kamiran.

Screenshot of the Urdu-Hindi dictionary.

“It becomes a bit more complicated because there are several variations of the same word in Roman Urdu. The same word ‘kursi’ can also be written as kursee or even kurrsi.”

The application can be used to translate from Roman Urdu to English, Roman Hindi to English, and even English to Urdu, Hindi and Arabic.

Each category provides you with the translated word along with its definition, synonyms and verbs.

The Urdu-Hindi Dictionary offers a simple user interface (UI) with more than ten categories for user convenience, such as fruits, vegetables, animals, professions, elements and tools.

In order to retain the user’s interest, the application offers timed quizzes to test their vocabulary.

The application is currently available for download on the Google Play Store.

It was relaunched on 11th November, 2016 and has already had 30,000 downloads on Android and 8,000 downloads on iOS.

According to the Data Science Lab, there were several challenges involved in making this application.

“One complication was that there is no standard for writing in Roman Urdu and quite a few variations of the same word existed,” says Dr. Kamiran.

The members of the Data Science Lab wrote a research paper addressing this variation and devised an algorithm that recognises these different spellings as the same word.

“A person using this application could write the word ‘kursi’ with any slight difference in spelling and the application would still understand what the user is trying to say and give them the translation of the word in their selected language,” he added.
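
The idea of mapping spelling variants to one entry can be illustrated with a minimal sketch. This is not the Data Science Lab's algorithm, which the article does not describe; it is a simple rule-based stand-in that assumes variants differ only by doubled letters and by long-vowel spellings such as "ee" for "i". The names `canonical_key`, `translate` and the one-word lexicon are hypothetical.

```python
import re

# Hypothetical toy lexicon for illustration only; not the app's data.
LEXICON = {"kursi": "chair"}


def canonical_key(word: str) -> str:
    """Reduce a Roman Urdu spelling to a rough canonical key."""
    key = word.strip().lower()
    # Assumption: "ee" and "oo" are alternate spellings of "i" and "u".
    key = key.replace("ee", "i").replace("oo", "u")
    # Collapse any run of a repeated letter into a single letter.
    key = re.sub(r"(.)\1+", r"\1", key)
    return key


# Index the lexicon by canonical key so every variant hits the same entry.
INDEX = {canonical_key(word): gloss for word, gloss in LEXICON.items()}


def translate(word: str):
    """Return the English gloss for any spelling variant, or None."""
    return INDEX.get(canonical_key(word))


for variant in ("kursi", "kursee", "kurrsi"):
    print(variant, "->", translate(variant))
```

All three spellings quoted by Dr. Kamiran reduce to the same key under these rules. A production system would likely need a far richer model of variation than two substitution rules.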

The Citizen Feedback Monitoring Program (CFMP), a project of the Punjab Information Technology Board, is looking closely into this application for future use.

CFMP is a popular initiative by the Punjab government to fight corruption and get feedback from citizens who are utilising public services such as property registration or getting driver’s licenses.

Much of the feedback this program receives from citizens is written in Roman Urdu.

The CFMP team had to spend a lot of money checking the feedback and trying to figure out what was being said in the SMS messages.

In addition to the long processing time, CFMP was spending Rs6 million per annum to sort the feedback it received into different categories.

The amount of time and resources being spent was a major hurdle in the sustainability of the program. In 2015, CFMP had to curtail the program and look for other means of connectivity such as robocalling.

However, with the use of this application, which has an accuracy of more than 71 per cent, translation can become easier.

According to the Analyst Team at CFMP, if the Urdu-Hindi app is implemented instead of the existing infrastructure, it could possibly save CFMP Rs30m over the next five years.
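
The Rs30m estimate is consistent with the Rs6 million annual classification cost, if one assumes that cost is avoided in full in each of the five years. The article does not state how the Analyst Team arrived at its figure, so the short check below is only an illustration of that assumption; the function name is hypothetical.

```python
# Rs6 million per annum, the classification cost quoted in the article.
ANNUAL_CLASSIFICATION_COST_RS = 6_000_000
YEARS = 5


def projected_saving(annual_cost: int, years: int) -> int:
    """Total saving, assuming the full annual cost is avoided every year."""
    return annual_cost * years


saving = projected_saving(ANNUAL_CLASSIFICATION_COST_RS, YEARS)
print(f"Rs{saving // 1_000_000}m over {YEARS} years")
```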

An application that can translate Roman Urdu into English can serve as a teaching tool and can be used to help empower the low-literate population.

The Urdu-Hindi dictionary also has many applications in social media mining.

Screenshot of the Urdu-Hindi dictionary.

Text written in a language that most people are comfortable using can be analysed for election poll prediction, sentiment analysis, review of emerging topics and analysis of government or private projects and services.

According to Dr. Kamiran, standardising a language presents many difficulties, but someone will have to start doing it, and the Urdu-Hindi dictionary is a step in the right direction.

“We also need to mine Roman Urdu because otherwise you are losing the language of the people,” says Dr. Kamiran. “Currently, we have a stable lexicon with us but our aim in the future is to go even further and make a translator.”

The article originally appeared on MIT Tech Review Pakistan and has been reproduced with permission.