Bringing Digital Indic Script Standardization Back To India

In our previous series on character standardization and the Government of India’s language mandate for mobile devices, we saw why it is important to create standards for Indic script character sets.

Background

The mandate was envisioned as covering display support for 22 Indian languages, with input support for at least two Indian languages, on all mobile devices sold in India. As we saw, the aim was to ensure that hundreds of millions of Indian language internet users could access the Indian internet with ease, a right previously denied to them.

The Government of India’s Indian language mandate for mobile devices was worked on by the Bureau of Indian Standards, in consultation with language experts and policy makers. These officials and experts came together to ensure that the mandate would cover the most relevant and most widely used characters – the essential characters – for each script, for each language.

MeitY (the Ministry of Electronics and Information Technology), another government body, was responsible for implementing and enforcing this mandate.

Industry implications

Apart from these technical and policy-level reasons, which directly affect the usability of digital devices, the mandate has important industry-level implications. To understand them better, we have to take a step back and look at the history of the Indian internet.

Earlier, when the Indian internet was still emerging, different devices supported different languages, and different character sets for the same language. This happened because the industry had no guidelines to follow on the internal grammar rules of each script. Devices, however, went on being produced even in the absence of such guidelines.

The result was that text written on one device would not render as intended on another device. Now, imagine entire websites, an entire ecosystem at the mercy of these rendering issues, and you’ll come close to imagining the chaos that Indian users were greeted with. It’s not surprising that, instead of trying to make sense of the mess of characters and empty boxes (or ‘tofu characters’) in front of them, they chose to switch to English instead, even if their grasp of the language was rudimentary.

As you can imagine, this left the Indian internet significantly disadvantaged right from the beginning, and its potential stakeholders could do little about it. Their hands were tied.

Character sets

Text support as defined by the mandate is based on adherence to a defined basic character set for each language: a set that includes all the necessary characters in active use by speakers of the language, and excludes archaic and extraneous characters that can only add confusion. The end result is a standard that leaves little ambiguity and ensures that Indic language text can be viewed across devices.
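To make the idea of a basic character set concrete, here is a minimal Python sketch of checking text against one. The set used here is purely illustrative and is not the character set defined by the mandate: it assumes the Unicode Devanagari block minus a hypothetical exclusion list of Vedic accent marks and rarely used vocalic letters. The names `HYPOTHETICAL_EXCLUDED`, `ALLOWED`, `NEUTRAL` and `characters_outside_set` are invented for this example.

```python
# Illustrative only: this is NOT the character set defined by the mandate.
DEVANAGARI_BLOCK = range(0x0900, 0x0980)  # Unicode Devanagari block

# Hypothetical exclusion list: Vedic accent marks and rarely used vocalic letters.
HYPOTHETICAL_EXCLUDED = {
    0x0951, 0x0952,          # stress signs UDATTA and ANUDATTA
    0x090C, 0x0960, 0x0961,  # letters VOCALIC L, VOCALIC RR, VOCALIC LL
    0x0944, 0x0962, 0x0963,  # vowel signs VOCALIC RR, VOCALIC L, VOCALIC LL
}

ALLOWED = {cp for cp in DEVANAGARI_BLOCK if cp not in HYPOTHETICAL_EXCLUDED}

# Spaces and basic punctuation are accepted regardless of script (an assumption).
NEUTRAL = set(" \n.,!?-")


def characters_outside_set(text):
    """Return the distinct characters of `text` that are outside the allowed
    set, in order of first appearance."""
    outside = []
    for ch in text:
        if ch in NEUTRAL or ord(ch) in ALLOWED:
            continue
        if ch not in outside:
            outside.append(ch)
    return outside


if __name__ == "__main__":
    print(characters_outside_set("नमस्ते भारत"))  # everyday text
    print(characters_outside_set("ॠषि"))          # starts with VOCALIC RR (U+0960)
```

A device or input method that follows a defined set could run a check of this kind before accepting or displaying text. The real sets are defined per language by the standard, not by a Unicode block range as assumed here.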

However, Unicode’s process of standardizing and defining these characters overlooked the way native speakers perceive and understand their own scripts.

Unicode sets for Indic scripts include numerous extraneous characters. These include characters that look similar to forms produced through combination; archaic or obscure characters that native speakers (apart from perhaps specialists) would be unaware of; and junk characters, that is, illegal combinations that the script’s own inherent rules do not permit.
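Two of these categories can be illustrated with a short Python sketch using only the standard `unicodedata` module. The first part shows a Devanagari letter that exists both as a single code point and as a base letter plus the nukta sign: the two look alike on screen but compare unequal until they are normalized. The second part is a deliberately simplified check for one kind of illegal combination, a vowel sign that is not attached to a consonant. The rule and the names `same_text` and `misplaced_vowel_signs` are assumptions made for this example, not the rules laid down by the standard.

```python
import unicodedata

# The same visible letter, encoded two ways.
SINGLE_CODE_POINT_QA = "\u0958"   # DEVANAGARI LETTER QA
COMBINED_QA = "\u0915\u093c"      # DEVANAGARI LETTER KA + SIGN NUKTA


def same_text(a, b):
    """Compare two strings after Unicode NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Simplified Devanagari classes, assumed for this sketch.
CONSONANTS = {chr(cp) for cp in range(0x0915, 0x093A)}   # KA .. HA
VOWEL_SIGNS = {chr(cp) for cp in range(0x093E, 0x094D)}  # sign AA .. sign AU
NUKTA = "\u093c"


def misplaced_vowel_signs(text):
    """Return the indexes of vowel signs that do not directly follow a
    consonant (optionally carrying a nukta). Expects NFC-normalized input."""
    bad = []
    for i, ch in enumerate(text):
        if ch not in VOWEL_SIGNS:
            continue
        j = i - 1
        if j >= 0 and text[j] == NUKTA:
            j -= 1  # skip the nukta to reach the base letter
        if j < 0 or text[j] not in CONSONANTS:
            bad.append(i)
    return bad


if __name__ == "__main__":
    print(SINGLE_CODE_POINT_QA == COMBINED_QA)            # raw comparison
    print(same_text(SINGLE_CODE_POINT_QA, COMBINED_QA))   # after normalization
    print(misplaced_vowel_signs("\u0915\u093f"))          # KA + sign I
    print(misplaced_vowel_signs("\u0915\u093f\u0940"))    # KA + sign I + sign II
```

Real validation of a script's grammar needs far more than this single rule (conjuncts, modifiers such as anusvara, and language-specific exceptions), so treat the check as a toy example of the kind of rule a standard can pin down.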

Reclaiming Indic script standards

The implementation of this mandate has significant implications for the Indian internet, since 99% of Indian language internet users access the internet from mobile devices.

In addition, this mandate ensures that Indian languages are internationalization-ready, in line with the W3C’s own objectives and vision for language and character standardization across the world. The Government’s mandate will build a sturdy foundation for the growth of the Indian language internet as well as the adoption of digital internationalization.

An additional objective of this mandate is to reclaim the standards for Indian languages and ensure that Indian experts and policy makers have control over them. Mobile devices from foreign phone manufacturers have a large presence in India, with Chinese devices alone holding a 53% market share as of 2018.

These companies did not always have the Indian language internet as a priority, and as a result would haphazardly release and use conflicting standards and character sets for their devices.

This undermines the Government’s vision of standardized Indian language text online and disrupts the smooth dissemination of content and information, with repercussions for how Indian language internet users experience the internet.

In short, this mandate brings control of Indian scripts back to their primary stakeholders: the Indian people, the same people who live, breathe, and speak these many languages.

Reverie Language Technologies Limited, a leader in Indian language localisation and user engagement technology solutions for over a decade, is working towards a vision to create Language Equality on the Internet.

Reverie’s language practice is dedicated to helping clients future-proof their rapidly expanding content by combining cutting-edge technologies like Artificial Intelligence and Neural Machine Translation (NMT) with best-practice approaches for optimizing content and business processes.

Copyright © 2024 Reverie Language Technologies Limited All Rights Reserved. 
