Saturday, June 6, 2026

Daily Times

When the Model Learns to Read Us

Published on: June 6, 2026 3:05 AM

June 6, 2026 by Asad Shabbir

For years, asking a large language model to work in Urdu was an exercise in mild humiliation. The Nastaliq script came back garbled, when it appeared at all; more often, the system silently swapped it for a stiff Arabic Naskh that no Pakistani reader would recognise as their own. Idioms were rendered literally. Anyone who tried to use these tools in their own tongue learned to switch to English, or to give up. The productivity dividend the rest of the world was quietly receiving was being paid out in a language the majority of us do not think in.

That is beginning to shift, and the shift matters more than it appears.

This year, a Pakistani researcher named Taimoor Hassan released Qalb, the largest language model built so far for Urdu, trained on nearly two billion tokens and benchmarked across more than seven international evaluation frameworks. Meta’s XLM-R 2.0 posted an eleven-point improvement in transfer to unseen scripts. The frontier labs are still English-first, but they are no longer English-only. Models are starting to learn our morphology, our right-to-left script, and our literary register. This is not a research footnote but a quiet redistribution of access.

Voice changes this further. The Urdu-capable voice mode that arrived in tools like ChatGPT means that, for the first time, a Pakistani who cannot or does not want to type can ask a question aloud and hear an answer back in their own tongue. In a country where literacy is uneven but spoken Urdu is universal, that removes the keyboard as a barrier to entry.

Consider what an Urdu-fluent model unlocks. A clerk in a tehsil office summarising case notes. A nurse in a rural BHU charting in Urdu instead of struggling through clinical English. An agricultural extension officer translating a soil report for a Sindhi farmer in real time. None of these professionals has been served by AI until now, because the tools were trained for someone else’s working language. A 2025 Stanford analysis estimated that countries whose primary languages are underrepresented in AI show roughly twenty per cent lower AI usage, attributable to language alone. That gap is not a market opening but a structural exclusion.

Urdu is spoken by over 200 million people. By population, that places it among the world’s top dozen languages. By representation in training data, it has historically sat far below that. The firms with the capability to build serious Urdu models are not the ones with the strongest commercial incentive to do so. A bank in California building the world’s best English assistant does not feel pressure from the absence of Urdu in its evaluation suite.

This is precisely the point at which a state becomes useful. When private firms have little commercial reason to invest in a language, a credible national buyer can change the calculation. The Prime Minister’s pledge, at Indus AI Week, to invest a billion dollars in artificial intelligence by 2030 is most consequential not as a number, but as a signal of national commitment to the work of making Urdu, and our other regional languages, first-class citizens of the model. That work means corpus building, Nastaliq-aware tokenisation, and evaluation benchmarks for legal and medical Urdu. These are not items a Silicon Valley product roadmap will reach on its own. They are essential to any country that wants its people to be understood by the machines they will increasingly live alongside.

There is a literary side to this that is easy to miss. Every previous shift in medium, from the printing press to broadcast radio to the open web, redrew who counted in public life. Languages early to a medium shaped it; languages late were shaped by it. Urdu’s relationship with print, in the nineteenth century, was a fight against being treated as an afterthought to Persian and English. Some of us have watched smaller versions of this fight more recently, when Nastaliq itself had to be coaxed onto a web that did not yet know how to render it. The same fight is now beginning in a new medium, and the stakes are higher because this medium does not just carry our writing. It interprets it, summarises it, and decides what counts as a competent rendering of it.

There is reason to be cautiously hopeful. Pakistani researchers are doing serious work, indigenous models are being trained, and national investment is, for the first time, oriented toward this question. The next decade will decide whether the language we read poetry in is also the language we instruct our machines in. That choice is being made now, in code and corpus and budget line, and the country has a narrow window in which to make it the right way.

The writer is a civil servant.

