Why LLMs should be “decolonised”

LLMs today need to speak a variety of languages, not just English with an accent.

Unless AI gets culture right, it will remain brilliant… but foreign. Let’s build its soul into the machine.

I was chatting with a friend in Jakarta last week. He asked one of the popular LLMs to write a heartfelt retirement speech for his father, something warm, respectful, a little funny, the kind of thing that would make the old man tear up in the best way.

The model delivered. Perfect grammar, soaring metaphors about ‘legacy’ and ‘dedication’, the usual Hallmark polish.

But it was also completely wrong.

It missed the gentle teasing wrapped in reverence, the weight of unspoken pride, the specific Javanese shorthand for ‘I love you’ that never actually says ‘I love you.’ My friend laughed and said, “It felt like a well-meaning American who read about Indonesian fathers in a guidebook.”

Although I laughed with him, I felt a sense of quiet crisis.

The problem isn't that AI can't write a good retirement speech for an Indonesian father. The problem is that as this technology seeps into everything (from healthcare and education to finance and governance), its inherent cultural biases become features, not bugs.

This is the hidden fracture line running through our entire AI revolution: we’re building a supposedly global brain on a corpus that is overwhelmingly Western, English-first, individualist, and let’s be honest, it’s very Silicon Valley. The dreams are in English; the jokes are Reddit-inspired; and the moral compass was calibrated in San Francisco.

Every LLM, whether it's GPT, Claude, or Gemini, is trained on oceans of text scraped from the internet. That text is mostly in English, and even more importantly, it carries the tone, assumptions, and value systems of the Western world. It's not anyone's fault; it's just how the digital universe evolved. But it does mean that these models have what I'd call a "Western accent", not just linguistically, but culturally.

It’s brilliant. It’s also wearing cargo shorts and flip-flops to everyone else’s wedding.

Outside the West, people ditch tools that feel alien or inaccurate, so this Western DNA affects adoption. Big time. People outside the West don't just want tools; they want tools that get them. If an AI keeps assuming everyone values self-expression over harmony, or doesn't know the difference between Diwali and Christmas, or can't navigate honour-based communication, guess what? It gets ignored. Adoption slows, innovation stalls. Local problems stay unsolved.

So, how do we fix this? How do we adapt this brilliant but culturally short-sighted technology? We can't just translate English models. We have to "re-conceive" them through three layers – culture, context, and governance.

Layer 1 – Ground the model in culture, not just vocabulary

Start with the data, but don’t stop at translation.

India isn’t “Hindi + English.” It’s Tamil poetry at 3 am, Bengali sarcasm sharp enough to cut glass, Punjabi bhangra lyrics, and WhatsApp forwards from your auntie that somehow contain the entire moral universe. An AI here has to know that “yes” can mean “maybe,” that mental health is spoken in metaphors, that Diwali isn’t a holiday…it’s identity. Researchers are already building Indian-language benchmarks and finding that simple prompt engineering isn’t enough (Drishtikon); the model needs to be marinated in regional films, news, folklore, and the thousand micro-cultures inside one nation.

The Middle East isn’t “Arabic.” It’s the difference between Cairene banter and Khaleeji restraint, between formal Fusha and the poetry of the street. An AI has to feel the rhythm of Ramadan, understand when “inshallah” is fatalism and when it’s politeness, and never, ever step on family honour. Governments in Riyadh and Abu Dhabi are already funding Arabic-first models that bake Islamic ethics and hospitality into the weights from day one, such as Saudi Arabia’s Allam.

North Asia (Japan, Korea, China) is home to tech giants that would laugh at being lumped together. Japan runs on wa (harmony above ego). Korea moves at bullet-train speed but still bows to hierarchy. China runs on guanxi (relationship networks) like oxygen. An AI here must master face-saving, know when silence is the correct answer, and never embarrass the elder in the chat. Benchmarks for East Asian cultural alignment are already showing how badly untuned models flatten Confucian or Shinto nuance.

Africa, with 54 countries and 2,000+ languages, rejects the idea of a single story. Ubuntu isn't a buzzword; it's philosophy. Storytelling is oral, decisions are communal, and resilience is the default setting. Decolonising LLMs here means training on griot traditions, Nollywood scripts, Swahili rap battles, and village WhatsApp groups. These models need to speak Swahili, Yoruba, and Amharic, not just English with an accent, and to understand village decision-making, not Silicon Valley boardrooms.

Eastern Europe carries post-Soviet memory in its bones: dry humour, deep history, and a fierce allergy to being anyone's footnote. An AI that treats Warsaw or Kyiv like "just another EU capital" will be quietly despised. Governance adds another layer here too: EU influence means risk-based regulation, but countries like Estonia lead with digital innovation, blending Western standards with local needs for data sovereignty.

In a nutshell, culture shapes not just how we speak but how we think, and that's where things get tricky for global AI adoption. To ground the model in the culture, we need:

  • Culturally curated corpora: We must build massive, high-quality training datasets in local languages, fed by regional literature, news archives, film scripts, social media (ethically sourced), and historical texts. This isn't just Arabic, but the specific dialects of the Gulf vs. the Levant. It's not just "Chinese," but understanding the cultural nuances in a post from Weibo versus a government white paper.
  • Local legends, not global heroes: The model should know about Chhatrapati Shivaji Maharaj as readily as it knows about George Washington. It should understand the significance of Sun Wukong in China, the philosophical depth of Wolof proverbs in Senegal, and the pop culture phenomenon of BTS in Korea.
  • Festivals, food, and feelings: It should know that Ramadan is more than fasting; it's about community and reflection. It should understand the emotional charge of Diwali, the quiet respect of Chuseok, and the vibrant chaos of Carnival in Brazil.
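The corpus-curation idea above can be pictured as a simple sampling plan that allocates a training-token budget across regional, dialect-aware corpora. This is a minimal sketch: the corpus names, source lists, and weights are hypothetical placeholders, not a real training recipe.

```python
# Illustrative manifest of regional corpora for culturally grounded training.
# All names, sources, and weights below are hypothetical examples.
REGIONAL_CORPORA = {
    "gulf_arabic": {
        "dialect": "Khaleeji",
        "sources": ["regional_news", "poetry_archives", "broadcast_transcripts"],
        "sampling_weight": 0.15,
    },
    "levantine_arabic": {
        "dialect": "Levantine",
        "sources": ["film_scripts", "ethically_sourced_social_media", "literature"],
        "sampling_weight": 0.12,
    },
    "tamil": {
        "dialect": "Tamil",
        "sources": ["poetry", "cinema_subtitles", "news_archives"],
        "sampling_weight": 0.10,
    },
}

def sampling_plan(corpora, total_tokens):
    """Split a total token budget across corpora in proportion to their weights."""
    total_weight = sum(c["sampling_weight"] for c in corpora.values())
    return {
        name: int(total_tokens * c["sampling_weight"] / total_weight)
        for name, c in corpora.items()
    }
```

The point of the sketch is only that dialect granularity (Gulf vs Levantine, not just "Arabic") becomes an explicit, tunable part of the data pipeline rather than an afterthought.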

Layer 2 – Make the model outputs reflect the norms of a community

Once the data layer captures cultural specificity, the generation layer must produce outputs that respect the communication patterns of the community. Culture isn’t what we say, it’s how we behave when no one’s watching. This is where psychology, anthropology, and sociology crash the party.

We need to take decades of research – Hofstede's cultural dimensions, high-context vs low-context communication, power distance, uncertainty avoidance – and bake it into the model's response style.

In high-context cultures (Japan, Korea, Arab world, much of Africa and Southeast Asia), meaning lives between the lines. The AI must learn to imply, to leave graceful exits, to read the air (kuuki o yomu). In high power-distance societies, it must never correct the boss in public. In parts of Eastern Europe scarred by unstable history, people crave clarity and trust-building rituals. An AI that sounds chaotic or flippant gets ignored fast.

This isn't window dressing. When people feel an AI gets their invisible social rules, trust skyrockets. When the system's outputs reflect local norms, user confidence rises and engagement improves, which is essential for meaningful interaction.

Training LLMs on ethnographic studies, local psychology research, and social-norms data teaches the system to modulate tone, formality, and indirectness to align with local expectations, ensuring relevance and enhancing user trust. This requires deep collaboration not just with computer scientists, but with anthropologists, sociologists, and linguists from the target regions.

Some methods for contextual calibration are:

  • Ethnographic data ingestion ensures that the system is aware of culturally salient references.
  • Social-norm signals derived from community-specific surveys inform the weighting of response styles.
  • Embedding cross‑cultural metrics in the objective function to guide the model toward culturally appropriate output.
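As a rough illustration of how such calibration might shape generation, here is a minimal Python sketch that maps Hofstede-style dimension scores to response-style directives for a system prompt. The 0–100 scale, the thresholds, and the directive wording are all assumptions made for this example, not a production alignment method.

```python
# Illustrative sketch: cultural-dimension scores (0-100) to style directives.
# Thresholds and directive wording are hypothetical.
def style_directives(power_distance, context_level, uncertainty_avoidance):
    """Return response-style hints for a given cultural profile."""
    hints = []
    if power_distance > 60:
        # High power-distance: deference to hierarchy matters.
        hints.append("Use formal honorifics; never contradict seniors directly.")
    if context_level > 60:
        # High-context communication: meaning lives between the lines.
        hints.append("Prefer indirect suggestions; leave graceful exits.")
    else:
        # Low-context communication: be explicit.
        hints.append("Be explicit and direct; state conclusions first.")
    if uncertainty_avoidance > 60:
        # High uncertainty avoidance: structure and caveats build trust.
        hints.append("Give clear structure, caveats, and trust-building detail.")
    return hints

# A profile sketched for a high-context, high power-distance setting:
high_context_profile = style_directives(power_distance=70, context_level=85,
                                        uncertainty_avoidance=90)
```

In a real system these hints would feed into prompt construction or fine-tuning targets; the sketch only shows how quantified cultural research can become a concrete, testable input rather than a vague aspiration.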

Layer 3 – Make AI governance guardrails, not definitive rules

One ethical framework to rule them all is the new digital colonialism. One size doesn't fit anyone.

  • China's Model: Prioritises social stability and state control. An LLM governed for China must align with these societal goals.
  • EU's Model (GDPR): Focuses on individual privacy and data rights as fundamental human rights.
  • India's Potential Model: India needs data sovereignty. It might have to balance its vibrant, argumentative democracy with its diverse religious and cultural sensitivities. Its approach to free speech and misinformation will be its own.
  • African Nations' Models: Africa wants inclusion. It might prioritise community rights over individual rights and focus heavily on using AI for leapfrogging in agriculture, healthcare, and financial inclusion.
  • The Middle East's Model: Needs faith-aligned ethics, with Islamic principles shaping what the system says and how it says it.

These governance modules must be implemented to function as configurable guardrails, providing transparent alignment layers that stakeholders can adapt to local law and cultural norms. The approach treats governance as a set of adaptable constraints rather than a single, uniform requirement.
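One way to picture governance as configurable guardrails is a per-jurisdiction profile that a deployment loads instead of a hard-coded rulebook. The Python sketch below is a deliberate simplification: the policy fields and their values are hypothetical illustrations, not a summary of any actual regulation.

```python
# Illustrative sketch: governance as a configurable, per-jurisdiction profile.
# Field names and values are hypothetical simplifications, not legal summaries.
from dataclasses import dataclass, field

@dataclass
class GuardrailProfile:
    jurisdiction: str
    data_residency_required: bool          # must data stay in-region?
    individual_privacy_first: bool         # privacy framed as individual right?
    community_rights_weighting: float      # 0 = individual-first, 1 = community-first
    blocked_topics: list = field(default_factory=list)

# A profile loosely inspired by the EU's privacy-first stance:
EU_PROFILE = GuardrailProfile(
    jurisdiction="EU",
    data_residency_required=True,
    individual_privacy_first=True,
    community_rights_weighting=0.2,
)

def apply_guardrails(profile, request_topic):
    """Return whether a topic passes this jurisdiction's guardrails."""
    return request_topic not in profile.blocked_topics
```

The design choice the sketch expresses is the article's point: the model core stays shared, while the alignment layer is swapped per jurisdiction, transparently and in one place.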

What is coming next?

LLMs with Western DNA won't cut it globally. We need cultural co-creation. Local data. Local teams. Local values. Not as an afterthought, but from day one.

The first generation of LLMs mirrored the West's digital self-image. Future iterations must:

  • Curate high‑quality, dialect‑aware corpora that incorporate local literature, media, and cultural markers.
  • Integrate cross‑cultural metrics and social‑norm datasets into the training pipeline.
  • Deploy modular governance components that can be toggled per jurisdiction.
  • Validate outputs through localised user studies to ensure cultural alignment and trust.

Achieving these steps requires collaboration across disciplines, including data scientists, cultural anthropologists, sociologists, legal experts, and regional linguists. We need to work together to create systems that serve each community effectively without imposing a one‑size‑fits‑all model.

Because culture isn't noise in the data; it is the data. And unless we get that right, AI will remain brilliant… but foreign. Let's build its soul into the machine.


-Nishith Srivastava, founder, The Agentics Co.

-Tijana Nikolic, consulting partner, AI – strategy, governance and delivery, The Agentics Co.
Source:
Campaign India
