The Case For Indic Search

Search on websites and apps is one of those things we use so much and so often, we take it for granted. But have you ever heard of Indic search?

Search has become an indispensable part of our lives online. Content discovery and content recommendations, for example, rest almost entirely on search. Things have only expanded from there. We use search to find directions. Restaurants. Videos. Shopping. Even insurance plans!

It’s impossible to imagine the internet without search.

How search works 

A lot of what makes search great at what it does goes on behind the scenes. When you type something, the search engine looks through a pre-built index of pages for relevant results: those that match your search terms most closely, taking into account common variations of those terms. It then gives you a set of results.

You can then pick, depending on your requirement, from the list of results in front of you.
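To make the idea of a pre-built index concrete, here is a minimal sketch of an inverted index in Python. It is an illustration only, not how any particular search engine works: the page ids, the sample text, and the function names build_index and search are all made up for this example, and real engines add stemming, spelling variants, and far more sophisticated ranking.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase term to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in text.lower().split():
            index[term].add(page_id)
    return index


def search(index, query):
    """Return page ids ordered by how many query terms each page contains."""
    scores = defaultdict(int)
    for term in query.lower().split():
        # .get() avoids creating empty entries for unknown terms
        for page_id in index.get(term, ()):
            scores[page_id] += 1
    # Highest score first; ties broken by page id for a stable order
    return sorted(scores, key=lambda p: (-scores[p], p))


# Hypothetical sample pages, indexed once ahead of time
pages = {
    "p1": "cheap flights to delhi",
    "p2": "delhi restaurants and street food",
    "p3": "health insurance plans",
}
index = build_index(pages)
```

The index is built once, before anyone searches. At query time the engine only looks up the query terms, which is why results come back quickly even across a large number of pages.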

Search, like much of what the modern internet is built on, was originally built in English, for English language content.

Needless to say, the world has moved on since then, and so has the internet. English is no longer the only language used online.

Over the years, Indian language users have come to form the biggest chunk of India’s netizens. As the number of Indian language users continues its rapid rise (90% of new internet users over the next 5 years will be Indian language users), the amount of digital content in Indian languages will see a similar increase.

To ensure that search in Indic languages delivers the same level of functionality available to English language users, an Indic search ecosystem needs to be built.

With more Indian language internet users comes more digital Indic content. With more content and more users comes a greater number of search queries for more content.

Basically, when more users are online, they create and share more content and are constantly on the lookout for more content. This drives search volumes higher and higher.

This has wider implications. Search engines can be integrated with websites and apps too. As more and more platforms support more content, search becomes a means of navigating through the options to find what you’re looking for.

Anyone who runs an app or website knows just how critical proper search is. There’s a reason companies constantly tweak their search engines to give them that extra edge in performance. Integrating Indic search can give a further boost to a platform’s search functionality, with direct results for the company’s bottom line.

Some verticals with an especially high reliance on search include banking, e-learning, entertainment, and most importantly, e-commerce. Many, if not most, of these are also high-frequency, content-dense verticals with highly engaged, frequently returning users.

E-commerce, more so than other verticals, requires a solid, highly functioning search ecosystem to ensure that customers find what they’re looking for, right away. Search drives product discovery, and is directly tied to revenue.

The better the search system, the more easily products and services can be found, and the more likely customers are to convert and purchase them.

A functioning, stable Indic search ecosystem comes into play here.

Ensuring consistent text encoding

Search in Indic languages comes with its own challenges, which are a result of the inherently nonlinear nature of Indic scripts.

A prerequisite for a smooth search experience is proper indexing.

The way Indic script rendering works makes it possible for characters to be built from different sequences of code points, yet look the same on screen. This is similar to how l (lowercase l) and I (capital i) can be confused for each other in the Latin script.

An example from Hindi: लो can be encoded two ways, as ल + ो (correct) or as ल + ा + े (incorrect).

Unfortunately, both versions turn up different results, just as l and I would – this is because they are formed and encoded differently, despite looking identical.
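The mismatch is easy to reproduce. The sketch below, in Python, compares the two encodings of लो and folds the incorrect sequence into the correct one before indexing or querying. The mapping table VARIANT_SEQUENCES and the function normalize_indic are hypothetical names made up for this illustration, and the table holds only the one pair discussed above. A production system would need a much fuller table for each script. The snippet also applies standard Unicode normalization (NFC) first, which on its own leaves this particular pair unmerged.

```python
import unicodedata

CORRECT = "\u0932\u094B"          # ल + ो  (LA + VOWEL SIGN O)
INCORRECT = "\u0932\u093E\u0947"  # ल + ा + े  (LA + VOWEL SIGN AA + VOWEL SIGN E)

# Hypothetical mapping of look-alike sequences to their preferred form.
# Only the pair from the example above is listed here.
VARIANT_SEQUENCES = {
    "\u093E\u0947": "\u094B",  # ा + े  ->  ो
}


def normalize_indic(text):
    """Apply NFC, then fold known look-alike sequences into one form."""
    text = unicodedata.normalize("NFC", text)
    for variant, preferred in VARIANT_SEQUENCES.items():
        text = text.replace(variant, preferred)
    return text


# The raw strings differ, so an exact-match index treats them as two terms
same_before = CORRECT == INCORRECT
# After folding, both spellings map to the same index term
same_after = normalize_indic(CORRECT) == normalize_indic(INCORRECT)
```

Running the same normalization on both the indexed content and the incoming query is what makes the two spellings land on the same results.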

Realising this, the Government of India has pushed for character set standardization on mobile devices. This should go a long way in ensuring uniformity in digital content, since most Indians use mobile devices to access the internet.

One prominent feature of India’s linguistic landscape is widespread multilingualism – people often speak languages other than their mother tongue, sometimes including English. This finds its way onto the internet as well.

Search should be able to navigate this multilingual landscape and index results in different languages and scripts, giving users a great degree of interoperability and flexibility across these different languages and scripts.
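One small building block for this is working out which scripts a query contains, so that it can be matched against content in each of them. The Python sketch below does this using Unicode block ranges. It is a simplified illustration: the function name detect_scripts is made up for this example, only four Indic scripts are listed, and anything that is neither an ASCII letter nor in one of the listed blocks is simply ignored.

```python
# Unicode block ranges for a few Indic scripts (inclusive code points)
SCRIPT_RANGES = {
    "Devanagari": (0x0900, 0x097F),
    "Bengali": (0x0980, 0x09FF),
    "Tamil": (0x0B80, 0x0BFF),
    "Telugu": (0x0C00, 0x0C7F),
}


def detect_scripts(text):
    """Return the set of script names found in the text."""
    scripts = set()
    for ch in text:
        if ch.isascii() and ch.isalpha():
            scripts.add("Latin")
            continue
        code_point = ord(ch)
        for name, (low, high) in SCRIPT_RANGES.items():
            if low <= code_point <= high:
                scripts.add(name)
                break
    return scripts


# A mixed query: an English product name followed by a Hindi word
mixed_query_scripts = detect_scripts("iphone कवर")
```

A search system could use the detected scripts to decide which language indexes to consult for a given query, rather than assuming every query is in a single language.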

The search never ends

Ultimately, creating a strong search system is essential in powering content discovery and consumption. As the Indian internet consolidates its Indian language user majority, Indic search will play an increasingly large role.

Businesses need to keep this in mind and ensure it’s not just their front-end interfaces that are Indian language friendly, but their search engines too.
