Representing the voice of stakeholders, be it consumers, voters, businesses, or specialist segments, brings the challenge of inclusivity. Without a fair representation of the underlying variation, we cannot hope to fully access the truth about the population in question. The more varied the population, the more challenging it is to assess its true character.
The challenge of inclusion is exemplified by the task of representing the heterogeneity of India. The vastness and variation of India are well known. By 2027, India will be the most populous nation in the world. India is among the countries with the largest number of billionaires, and at the same time it has a vast population, almost the size of the population of the USA, living in desperate poverty.
Source of complexity: While some of this variation is a consequence of India’s size, much of this heterogeneity is a consequence of a long and complicated history. In particular, the cultural heterogeneity of India poses a substantive challenge to researchers trying to obtain a cogent understanding of the behaviour or attitudes of Indians. The real source of the problem, however, is not the variation itself. If the heterogeneity of India were merely a matter of the variation that we see on so many dimensions, then diligent application of basic statistical techniques would help us represent India.
The challenge of representing India comes not from the variation on so many factors but from the interconnectedness of these factors. Let us examine just three such basic factors and how the interaction among them creates complexity for representation and research.
Varied geography: In India, data is still typically collected in person. Urban areas, particularly metros with a more heterogeneous population, are harder to represent. Representing Delhi or Mumbai in a commercially viable way is a perennial challenge given the spread and rapid growth of these urban centres. Studies with wider coverage that demand a fair representation of urban and rural areas involve many research centres and take our field investigators to remote areas. Data collection requires travelling long distances, sometimes through treacherous routes, even reaching places without pucca roads. Fair representation of the voice of India across its length and breadth is truly a logistical nightmare. But varied geography also implies variation in agro-climatic conditions, which has a profound influence on the way people live. For instance, a study of the food habits of Indians would need to account not just for the variation in access that the rural-urban divide creates but also for the variation in agro-climatic conditions. Consider a standard question on food habits, taken from a European study of teenagers, that asks respondents about their breakfast.
Imagine the variety of responses to such a question in India. The sheer variety of foods and dishes that one would need to account for would itself make response capture and analysis an arduous task. But beyond the range of food options, the more important question is whether there is a single, universally understood concept of breakfast. Variations in agro-climatic conditions, which shape occupations and economic status, create different meal patterns across the country. In many regions it may not be possible to find a distinct meal called breakfast, and asking such a question there will elicit no meaningful response. The lesson is that we need to examine the contextual realities assumed in the way an enquiry is framed. Using the term "breakfast" assumes that this is a readily recognizable meal. A more inclusive framing would be to ask about the first meal of the day.
Myriad languages and dialects: It is well known that India has many languages. It may seem fair to think of India as containing a Europe within. But that analogy is not quite apt. The 24 official languages of the European Union are written in just three scripts (Latin, Greek and Cyrillic). The 22 official Indian languages are written in 14 distinct scripts. This means creating an enquiry that is comprehensible in all regions, coupled with the challenge of translation and production in all these languages. But variation in languages is not just about translations and scripts. Languages embody cultures, and within them live the legends, beliefs, attitudes and values that characterize a culture. For a translation to ring true, it needs to account for cultural variations. Imagine mounting a motivational study in India. Translating core motivations like power, control and conviviality into all these languages goes beyond the simple task of finding a synonym. In many cases, such as conviviality, there may be none. As young researchers learning the craft of designing an enquiry, we are taught to think first in an Indian language and then to translate into English. It is always easier to translate from one proximate language to another, such as Tamil to Malayalam or even Hindi to Tamil, than to use English as the core or hub reference language. What we learn is that translation is not merely capturing concepts in different languages, but presenting them in a culturally relevant way.
Wide contrast in education levels: In the absence of high literacy levels, self-completion as a mode of data collection, a standard approach in many parts of the world, is not viable in India. Surveys in India are usually administered to the respondent: the field investigator reads out the questions and response options, and the respondent's answers are recorded on a digital device or a printed questionnaire. This manner of investigation has advantages beyond enabling the inclusion of non-literate and less literate segments. It also creates a uniform enquiry and more efficient data capture. But there are clear limitations too. Capturing routine behaviour, which requires self-completion of diaries, is nearly impossible for semi-literate and illiterate populations. However, these challenges can be surmounted with a little ingenuity, such as getting school children from the family or neighbourhood to record a behaviour diary for a less literate neighbour or family member.
But the real exclusion created by lack of education is more profound and has aspects that cannot be overcome. Lack of education inhibits access to digital technology. If one deploys a digital mode of data collection such as online panels, vast proportions of the population are completely excluded. Can you reach a small-town petty trader through an online panel? No. Can you survey older women through mobile panels? Not likely, because mobile panel samples are sourced from the stream of users of gaming and other apps. Similarly, can you speak to the affluent urban milieu through face-to-face data collection? Only with great difficulty, because access to their homes is limited and even intercepting them at places like malls and airports is restricted by security regulations. Varied modes of data collection therefore need to coexist, and a complete transition to online data collection is still not feasible. In India, using different modes of data collection simultaneously, that is, hybrid data collection, is not just about correcting for biased representation but about including the full spectrum of the relevant population.
The challenge of accounting for variation can be efficiently solved by statistics. But true inclusion can only come through fair representation, and that is not merely a sampling problem. An examination of just three basic sources of variation tells us that inclusion requires a fair representation of the contextual, cultural and access realities of populations. The challenge of inclusion, of understanding the world around us in all its complexity and diversity, is about enabling the expression of people’s lived truths.
The author is chief client officer, Ipsos India.