OpenAI Launches IndQA: A Cultural AI Benchmark for Indian Languages
OpenAI has launched IndQA, a groundbreaking benchmark designed to evaluate AI systems on their understanding of Indian culture, languages, and context. Developed with 261 experts across 12 languages, this initiative aims to make AI more inclusive and effective for India’s diverse population.
Key Takeaways
- IndQA is a new benchmark for evaluating AI on Indian culture and languages.
- It features 2,278 questions across 12 languages and 10 cultural domains.
- Developed with 261 Indian domain experts to ensure cultural accuracy.
- Aims to address the limitations of existing English-centric AI evaluations.
Filling a Critical Gap in AI Evaluation
About 80% of the world’s population does not speak English as a primary language, yet existing multilingual benchmarks have proven inadequate for measuring how well AI serves them. Current evaluations focus heavily on translation and multiple-choice tasks, failing to capture the cultural context, history, and local nuances that matter to people.
“Today we are rolling out IndQA. Built in collaboration with 261 experts across 12 languages, IndQA fills a key gap by enabling fair and rigorous evaluation that reflects India’s cultural and linguistic diversity,” said Srinivas Narayanan, CTO of B2B Applications at OpenAI.
Comprehensive Cultural Coverage
IndQA spans 2,278 questions across 10 cultural domains: Architecture & Design, Arts & Culture, Everyday Life, Food & Cuisine, History, Law & Ethics, Literature & Linguistics, Media & Entertainment, Religion & Spirituality, and Sports & Recreation. The benchmark covers 12 languages: Bengali, English, Hindi, Hinglish, Kannada, Marathi, Odia, Telugu, Gujarati, Malayalam, Punjabi, and Tamil.
Unlike existing benchmarks such as MMMLU and MGSM, IndQA probes culturally nuanced, reasoning-heavy tasks that current evaluations struggle to capture. Each datapoint includes a culturally grounded prompt, an English translation for auditability, grading criteria, and an expert-curated ideal answer.
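To make the datapoint structure concrete, here is a minimal Python sketch of how one such record might be represented and scored against its grading criteria. The class names, field names, per-criterion weights, and the weighted-fraction scoring rule are all illustrative assumptions; the article does not describe IndQA’s actual schema or grading formula.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One grading criterion. The weight is an assumption for illustration."""
    description: str
    weight: float = 1.0


@dataclass
class IndQAItem:
    """Hypothetical record mirroring the four parts named in the article."""
    language: str
    domain: str
    prompt: str                # culturally grounded prompt in the target language
    english_translation: str   # kept alongside the prompt for auditability
    criteria: list[Criterion]  # grading criteria written by the expert
    ideal_answer: str          # expert-curated reference answer


def rubric_score(item: IndQAItem, met: list[bool]) -> float:
    """Return the weighted fraction of criteria a response satisfied.

    `met[i]` says whether criterion i was judged satisfied. How that
    judgment is made (human or model grader) is outside this sketch.
    """
    if len(met) != len(item.criteria):
        raise ValueError("need exactly one judgment per criterion")
    total = sum(c.weight for c in item.criteria)
    if total == 0:
        return 0.0
    earned = sum(c.weight for c, ok in zip(item.criteria, met) if ok)
    return earned / total


if __name__ == "__main__":
    # Placeholder content only; no real IndQA question is reproduced here.
    item = IndQAItem(
        language="Hindi",
        domain="Food & Cuisine",
        prompt="<prompt in Hindi>",
        english_translation="<English translation of the prompt>",
        criteria=[
            Criterion("<key fact the answer must state>", weight=3.0),
            Criterion("<supporting detail>", weight=1.0),
        ],
        ideal_answer="<expert-written ideal answer>",
    )
    print(rubric_score(item, [True, False]))
```

In this sketch, a response that meets only the heavier of the two criteria earns 3 of 4 possible points, so the script prints 0.75.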
Rigorous Development Process
Native-level speakers with deep domain expertise drafted reasoning-focused prompts tied to their regions and specialties. Each question was tested against OpenAI’s strongest models, including GPT-4o, OpenAI o3, and GPT-4.5, and, after its public launch, partially against GPT-5.
The company clarified that IndQA isn’t a language leaderboard since questions aren’t identical across languages. Instead, it serves to measure improvement over time within model families.
India as a Starting Point for Global AI Inclusion
India has approximately one billion people who do not speak English as their primary language, along with 22 official languages, which made it an obvious starting point for OpenAI’s global inclusion efforts. This work is part of OpenAI’s commitment to improve products and tools for Indian users, making technology accessible to students, farmers, educators, and others.
At a media conference, Mr. Narayanan emphasized, “India can be a beacon of how AI can be used for social good including education, health and farming etc.”
He also noted the company’s growing global developer ecosystem of 4 to 5 million developers, saying that OpenAI is “propping up the developer ecosystems so that they can do more with AI. We continue to improve our models, pushing the frontiers of technology to help enterprises to have a better agentic future.”



