How Pakistan’s spelling mistakes could lead to fraud

Published October 22, 2015
Spelling mistakes pose serious hardships in fraud detection when the same person walks around with multiple identification documents made possible by variant spellings. —File

A rose by any other name would smell as sweet. But, will it smell as sweet with another spelling?

Spelling matters in writing, and even more so in record-keeping. But not in Pakistan, where misspelling nouns is a national sport.

Elsewhere, variant spellings of the same word are common. However, variant spellings of the name of the same person or town are uncommon. In Pakistan, however, it’s a different “storey.”

I came face-to-face with our dyslexia when I tried to obtain my father's death certificate. His first name was Ajaz. I spelled it out for the benefit of the administrative assistants working at the hospitals and various municipal offices.

They instead took it merely as a suggestion and issued documents with their preferred spellings of my deceased father’s name. These included Ijaz, Ejaz, and Aijaz, to name a few.


It may sound unimportant, but spelling mistakes can impose significant economic costs in a world that increasingly relies on analytics.


Take retail-banking fraud as an example. An individual can have fraudulent documents issued with variant spellings, e.g., Umer and Umar. When new documentation, such as a bank account, is created using the existing documents in Urdu, applicants can use any spelling of their liking in English.

This would pose serious hardships in fraud detection, when the same person walks around with multiple identification documents made possible by variant spellings.
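To make the point concrete, here is a minimal sketch of how an analyst might flag names that are probably variants of one another. It uses the string-similarity ratio from Python's standard library. The threshold of 0.7 and the helper names are my own assumptions for illustration, not a description of how any bank actually screens its records.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Return a 0-to-1 similarity score, ignoring letter case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similar_pairs(names, threshold=0.7):
    """List every pair of names whose similarity meets the threshold.

    The threshold is an arbitrary, illustrative cut-off (an assumption).
    """
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if similarity(a, b) >= threshold
    ]


# Spellings mentioned in this article.
names = ["Ajaz", "Ijaz", "Ejaz", "Aijaz", "Umer", "Umar"]
pairs = similar_pairs(names)

for a, b in pairs:
    print(f"{a} ~ {b}: {similarity(a, b):.2f}")
```

A score like this only suggests that two records deserve a second look. Confirming that they belong to the same person would still require other fields, such as the date of birth or the identity card number.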

I decided to test my hypothesis about the spelling challenges of Pakistanis. It so happened that I got my hands on a data set that was available online for a brief time. The data set included the names and other details of 15,176 members of the armed forces who had died in the line of duty. Also included was the city district of the deceased’s origin.

A quick analysis of the data set revealed that spelling Campbellpur has been a real challenge for Pakistanis. No wonder its name was changed to Attock: the nation was stuck trying to get the spelling right for the city named after Sir Colin Campbell.

I used OpenRefine software to deal with misspelled cities. The raw data set listed 450 distinct names as the city of origin. After running several clustering algorithms to identify and correct misspelled names, I was able to reduce the number of cities to 204. In other words, more than half of the listed names were misspelled variants, which works out to more than one variant per town on average.
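For readers curious about what such clustering involves, the sketch below illustrates the general idea of key-collision clustering: reduce every name to a normalised key, then group the names that share a key. It is a simplified illustration in the spirit of that technique, not a reproduction of the software's internals or of my exact steps, and the sample names are made up for the example.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(name: str) -> str:
    """Build a normalised key: ASCII only, lowercase, no punctuation,
    with unique tokens sorted alphabetically."""
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    cleaned = re.sub(r"[^\w\s]", "", ascii_name.lower().strip())
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(names):
    """Group raw names by their fingerprint."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return dict(groups)


# Hypothetical raw entries, as they might appear in a messy column.
sample = ["Rawalpindi", "rawalpindi ", "RAWALPINDI.", "Attock", "attock", "Poonch"]
clusters = cluster(sample)

for key, members in clusters.items():
    print(key, "<-", members)
```

A key of this kind catches differences in case, punctuation and stray spaces. It does not catch genuine spelling variants such as Poonch and Punch, which need a fuzzier comparison or a manual review.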

It came as no surprise that Rawalpindi was the most frequently listed city of origin of the deceased soldiers, appearing 1,393 times.

The garrison city is at the centre of the arid districts that, for lack of agriculture, have historically been the primary catchment for military recruitment. Following Rawalpindi was Poonch (Punch), a small town in Azad Jammu and Kashmir. What is interesting about this data is the disproportionately large number of deceased soldiers, relative to the town’s population, belonging to AJK (Azad Kashmir). This, however, is a subject for another blog.

When it comes to typographical errors, I find the space between two words in a composite noun to be the biggest challenge. It generates several variants of the same name. For instance, Mirpur and Mir Pur are the most common alternative spellings of the same city, only one of which is correct.
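One way to catch this kind of variant is to compare names with all the spaces removed. The sketch below does that and then keeps the most frequent spelling in each group. Treating the most frequent spelling as the canonical one is an assumption made for illustration; the most common form is not necessarily the correct one, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def spaceless_key(name: str) -> str:
    """Lowercase the name and drop every space."""
    return "".join(name.lower().split())


def canonical_spellings(names):
    """Map each spaceless key to its most frequent spelling.

    Assumption: the most frequent spelling is used as the canonical one.
    """
    by_key = defaultdict(Counter)
    for name in names:
        tidy = " ".join(name.split())  # collapse repeated spaces
        by_key[spaceless_key(name)][tidy] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in by_key.items()}


# Hypothetical entries for one town.
sample = ["Mirpur", "Mir Pur", "Mirpur", "mirpur"]
canonical = canonical_spellings(sample)
cleaned = [canonical[spaceless_key(name)] for name in sample]
print(cleaned)
```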

Another challenge is the inconsistent use of abbreviations. For instance, D I Khan and Dera Ismail Khan refer to the same town in Khyber Pakhtunkhwa. If a data set spells names out in some records and abbreviates them in others, one would have a tough time analysing the data.
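Abbreviations cannot be resolved by string comparison alone, because D I Khan looks nothing like Dera Ismail Khan. A small, hand-maintained lookup table is the usual remedy. The sketch below assumes such a table; its single entry comes from the example above, and the function name is hypothetical.

```python
import re

# Hand-maintained lookup table (an assumption): keys are lowercase
# abbreviations with the dots removed, values are the full names.
ABBREVIATIONS = {
    "d i khan": "Dera Ismail Khan",
}


def expand_abbreviation(name: str) -> str:
    """Return the full name if the entry is a known abbreviation,
    otherwise return the entry with its spacing tidied."""
    key = re.sub(r"[.\s]+", " ", name).strip().lower()
    return ABBREVIATIONS.get(key, " ".join(name.split()))


for raw in ["D I Khan", "D.I. Khan", "Dera Ismail  Khan"]:
    print(raw, "->", expand_abbreviation(raw))
```

The table has to be extended by hand for every new abbreviation, which is tedious but keeps each substitution explicit and easy to audit.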

It turns out Pakistanis are not the only ones struggling with spellings. I live in Mississauga, a suburban town bordering Toronto. Note that the town’s name includes the letter 's' four times, which also results in several misspelled variants.

Over in the UK, the BBC reported that spelling mistakes cost millions in lost online sales. Charles Duncombe, an online entrepreneur, analysed website sales and found that “a single spelling mistake can cut online sales in half”.

Researchers have also studied typographical errors and their impact on the economy. Prudential Insurance Company, for instance, faced the prospect of losing millions after a typo in its mortgage documents. Other challenges arise from the different ways a word is spelled in British and American English, e.g., colour versus color.

Given that most town and person names in Pakistan are not native words or expressions in English, a lack of consistency in spellings will always be a challenge.

This, however, should not be a source of confusion, and certainly should not open doors for graft and organised crime where the miscreants either hide behind deliberately misspelled names or avoid conviction because of them.
