
In January 2026, something remarkable happened. 1,407 teams from across India came together for a single mission: make India’s data and digital resources ready for responsible AI use. The AI for All Challenge wasn’t just another hackathon. It was a statement about what India’s developer community can accomplish when pointed at the right problem.
Organized by Factly in collaboration with Meta and hosted on Reskilll, this nationwide online hackathon proved that India is ready to tackle its biggest data challenges head-on.
What Was the AI for All Challenge?
AI for All — India’s Open Data and AI-Readiness Challenge — was designed around a simple but powerful idea: India generates massive amounts of public data, but most of it isn’t ready for AI. Datasets are messy, unstructured, poorly documented, and scattered across hundreds of government portals and agencies.
The hackathon challenged participants to build open-source solutions that bridge the gap between raw data and AI-ready resources. Think of it as building the infrastructure layer that India’s entire AI ecosystem needs to actually function at scale.
Running from January 5 to January 23, 2026, the online format meant anyone in India could participate — from computer science students in small towns to experienced data engineers in metro cities. No travel required, no venue constraints, just pure problem-solving.
Why Open Data Matters for India’s AI Future
India generates enormous amounts of data — from census records and agricultural surveys to health statistics, economic indicators, and environmental monitoring. According to data.gov.in, India’s open data portal hosts thousands of datasets across dozens of government departments.
But there’s a critical problem that most people don’t realize: the vast majority of this data exists in formats that AI systems can’t easily use.
- PDFs everywhere — government reports that should be structured databases are locked in PDF format, often scanned images rather than searchable text
- Language barriers — Hindi and regional language documents with no machine-readable versions, making them invisible to most AI systems
- Missing metadata — datasets without proper documentation, column descriptions, or data dictionaries
- Fragmented access — data scattered across multiple portals, ministries, and state governments with no unified access point or standard format
- Quality issues — missing values, inconsistent formats, duplicate entries, and outdated records
Without solving these fundamental problems, India’s AI ambitions remain limited. You can’t build AI solutions for agriculture if the crop data is locked in Hindi PDFs. You can’t create healthcare AI if patient statistics are scattered across 28 state health departments in different formats.
The AI for All Challenge tackled this infrastructure problem head-on by asking developers to build the tools and pipelines that make open data actually usable for AI.
The Scale of Participation
1,407 teams is a significant number for any hackathon — and it’s especially impressive for one focused on data infrastructure rather than flashy consumer apps. The participation reflects the growing awareness among Indian developers that data readiness is the bottleneck for AI progress.
The participants came from remarkably diverse backgrounds:
- Computer science students building their first data pipelines and learning about data engineering
- Experienced data engineers from companies, working on production-grade solutions they could open-source
- Civic tech enthusiasts passionate about government transparency and public data access
- AI researchers exploring new approaches to automated data preparation and cleaning
- Journalism and social science students who understand the importance of data for public accountability
The online format, managed through Reskilll’s platform (the same system that handles 92,403 team registrations across 196 hackathons), ensured smooth registrations, team formation, and submissions even at this scale.
What Teams Built
The solutions that emerged from the AI for All Challenge covered a wide range of approaches to India’s data readiness problem:
Data Extraction and Structuring Tools
Several teams built tools that automatically extract structured data from government PDFs and reports. Using AI-powered OCR combined with natural language processing, these tools convert scanned documents that machines cannot read into clean, queryable datasets with proper column headers and data types.
One approach used Gemini’s vision capabilities to understand table layouts in complex PDF reports, extracting data that traditional OCR tools miss entirely.
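Once OCR or a vision model has produced plain text for a table, the remaining step is turning that text into typed records. The following is a minimal Python sketch of that post-processing step only, assuming columns are separated by two or more spaces. The names `parse_table_text` and `coerce` are hypothetical and chosen for illustration; this is not code from any team's submission.

```python
import json
import re

def coerce(cell):
    """Convert a cell to int or float when it looks numeric; otherwise keep the text."""
    cleaned = cell.replace(",", "")
    for caster in (int, float):
        try:
            return caster(cleaned)
        except ValueError:
            continue
    return cell

def parse_table_text(text):
    """Split column-aligned text into records; return (rows, skipped_lines)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return [], []
    # Assumption: columns are separated by runs of two or more spaces.
    headers = re.split(r"\s{2,}", lines[0])
    rows, skipped = [], []
    for ln in lines[1:]:
        cells = re.split(r"\s{2,}", ln)
        if len(cells) != len(headers):
            skipped.append(ln)  # keep for human review instead of guessing
            continue
        rows.append({h: coerce(c) for h, c in zip(headers, cells)})
    return rows, skipped

# Made-up sample text standing in for OCR output.
sample = """
State            Year    Wheat (tonnes)
Punjab           2021    17,500
Uttar Pradesh    2021    35,000.5
Total
"""
rows, skipped = parse_table_text(sample)
print(json.dumps(rows, indent=2))
```

Lines that do not match the header width are returned separately rather than dropped, so a reviewer can decide what to do with them.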
Multilingual Data Processing Pipelines
India’s data exists in dozens of languages. Teams built pipelines that process Hindi, Tamil, Telugu, Kannada, Bengali, and other regional language documents, translating metadata and column names while preserving the original data integrity. This makes datasets accessible to AI systems that primarily work in English while maintaining the original language content.
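The idea of translating column names while leaving the data itself untouched can be sketched in a few lines of Python. This example assumes a small hand-written glossary in place of a real translation model; `HEADER_GLOSSARY` and `translate_headers` are hypothetical names, and the glossary entries are illustrative only.

```python
# Hypothetical glossary mapping Hindi column headers to English ones.
HEADER_GLOSSARY = {
    "जिला": "district",
    "फसल": "crop",
    "उत्पादन": "production",
}

def translate_headers(rows, glossary):
    """Rename columns using the glossary; cell values are left unchanged."""
    translated = [
        {glossary.get(key, key): value for key, value in row.items()}
        for row in rows
    ]
    # Keep a map from new header to original header so nothing is lost.
    original_headers = rows[0].keys() if rows else []
    header_map = {glossary.get(key, key): key for key in original_headers}
    return translated, header_map

rows = [{"जिला": "पुणे", "फसल": "गेहूं", "area_ha": "120"}]
translated, header_map = translate_headers(rows, HEADER_GLOSSARY)
print(translated)
print(header_map)
```

Headers missing from the glossary pass through unchanged, and the returned `header_map` lets a consumer recover the original-language header for any column. Real headers would likely also need Unicode normalization (for example with `unicodedata.normalize`) before lookup.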
Automated Data Quality and Validation
Some teams focused on building automated quality checks — tools that scan public datasets and identify missing values, format inconsistencies, statistical outliers, duplicate entries, and logical errors. The tools then suggest corrections with confidence scores, allowing human reviewers to quickly validate and fix issues.
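A few of the checks described above can be illustrated with a short Python sketch. It covers missing values, duplicate rows, and one kind of format inconsistency (a numeric column that does not parse); outlier detection and confidence-scored corrections are left out for brevity. The function name `check_quality` and the sample data are assumptions made for this example.

```python
def check_quality(rows, numeric_field):
    """Return (row_index, issue) tuples for a list of dict rows."""
    issues = []
    seen = set()
    for i, row in enumerate(rows):
        # Missing values: None or empty/whitespace-only strings.
        for key, value in row.items():
            if value is None or str(value).strip() == "":
                issues.append((i, f"missing value in '{key}'"))
        # Duplicate rows: same key/value pairs as an earlier row.
        fingerprint = tuple(sorted((k, str(v)) for k, v in row.items()))
        if fingerprint in seen:
            issues.append((i, "duplicate row"))
        seen.add(fingerprint)
        # Format inconsistency: a non-empty value that is not a number.
        value = row.get(numeric_field)
        if value is not None and str(value).strip() != "":
            try:
                float(str(value).replace(",", ""))
            except ValueError:
                issues.append((i, f"non-numeric value in '{numeric_field}'"))
    return issues

# Made-up rows for illustration.
rows = [
    {"district": "Pune", "yield_kg": "1,200"},
    {"district": "", "yield_kg": "950"},
    {"district": "Pune", "yield_kg": "1,200"},
    {"district": "Nashik", "yield_kg": "n/a"},
]
for index, issue in check_quality(rows, "yield_kg"):
    print(index, issue)
```

The function only reports issues; deciding on a correction is left to a later step or a human reviewer.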
Unified Data Access Platforms
Others built aggregation platforms that pull data from multiple government portals into a single, searchable interface with standardized formats, proper documentation, and API access. Think of it as a “Google for Indian government data” — one search query across all sources.
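The core of such an aggregator is mapping each portal's field names onto one shared schema so that a single query can run across all of them. Here is a minimal Python sketch of that idea, with two invented portals held in memory; the portal names, field maps, and records are all hypothetical, and a real platform would fetch from live sources and use a proper search index.

```python
def normalize(record, field_map, source):
    """Map one portal-specific record onto the shared schema."""
    entry = {std: str(record.get(raw, "")).strip() for std, raw in field_map.items()}
    entry["source"] = source
    return entry

def search(catalog, query):
    """Case-insensitive substring search over title and description."""
    q = query.lower()
    return [
        e for e in catalog
        if q in e["title"].lower() or q in e["description"].lower()
    ]

# Hypothetical portals that name the same information differently.
PORTAL_A_FIELDS = {"title": "name", "description": "notes", "format": "file_type"}
PORTAL_B_FIELDS = {"title": "dataset_title", "description": "summary", "format": "fmt"}

catalog = [
    normalize(
        {"name": "District Rainfall 2021", "notes": "Monthly rainfall", "file_type": "CSV"},
        PORTAL_A_FIELDS, "portal_a",
    ),
    normalize(
        {"dataset_title": "Crop Yield Survey", "summary": "Yield and rainfall by block", "fmt": "XLS"},
        PORTAL_B_FIELDS, "portal_b",
    ),
]
for hit in search(catalog, "rainfall"):
    print(hit["source"], hit["title"])
```

Keeping the `source` field on every entry means a search result can always be traced back to the portal it came from.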
Data Documentation Generators
A creative approach from several teams: AI tools that automatically generate data dictionaries, README files, and usage documentation for existing datasets. This solves the metadata problem — making datasets discoverable and understandable without requiring the original data creators to write documentation.
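The structural part of a data dictionary can be derived from the data alone, before any AI model is involved. The Python sketch below infers a type, a missing-value count, and an example value for each column, and leaves the description blank for a reviewer or a model to fill in. The names `infer_type` and `build_data_dictionary` are hypothetical, and the sample rows are made up.

```python
def infer_type(values):
    """Guess a column type from its non-empty string values."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "empty"
    for caster, name in ((int, "integer"), (float, "number")):
        try:
            for v in non_empty:
                caster(v)
            return name
        except ValueError:
            continue
    return "text"

def build_data_dictionary(rows):
    """Build one dictionary entry per column, using the first row's keys."""
    entries = []
    for col in rows[0].keys():
        values = ["" if row.get(col) is None else str(row.get(col)) for row in rows]
        entries.append({
            "column": col,
            "type": infer_type(values),
            "missing": sum(1 for v in values if not v.strip()),
            "example": next((v for v in values if v.strip()), ""),
            "description": "",  # left blank for a reviewer or model to fill in
        })
    return entries

# Made-up rows for illustration.
rows = [
    {"state": "Kerala", "year": "2021", "rainfall_mm": "3055.4"},
    {"state": "Goa", "year": "2021", "rainfall_mm": ""},
]
for entry in build_data_dictionary(rows):
    print(entry)
```

A generated description could then be attached to each entry, with the inferred type and example value serving as context for whoever, or whatever, writes it.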
The Partnership: Factly × Meta
Factly, known for their pioneering work in data journalism and open data advocacy in India, brought deep expertise in understanding what makes data useful for public interest applications. They’ve been working with Indian government data for years and understand its challenges intimately.
Meta’s involvement added resources, technical expertise, and global visibility, reflecting the international tech industry’s growing interest in India’s data ecosystem and its potential for responsible AI development.
Together, they designed problem statements that were specific enough to guide participants toward useful solutions but open enough to allow creative approaches. This balance — clear constraints with creative freedom — is what makes a hackathon successful, and it comes from experience.
How Reskilll Powered the Event
Managing 1,407 teams in an online hackathon that runs for nearly three weeks requires robust, reliable infrastructure. Reskilll’s platform handled the complete event lifecycle:
- Registration and verification — automated team signup, member verification, and waitlist management
- Team formation — helping individual participants find teammates with complementary skills
- Submission management — collecting code repositories, documentation, demo videos, and project descriptions
- Communication — keeping all 1,407 teams updated on timelines, announcements, and mentor availability
- Evaluation support — structured judging through the Evaluator Platform
Impact and Legacy
The AI for All Challenge wasn’t just a hackathon — it was a catalyst for India’s open data movement. The impact extends well beyond the event itself:
- Open-source tools — the solutions created during the event are available for anyone to use, fork, and build upon
- Community formation — 1,407 teams means thousands of developers who now understand and care about data readiness
- Problem awareness — the event brought mainstream attention to the data infrastructure gap that limits India’s AI potential
- Skill development — participants gained hands-on experience with data engineering, NLP, and AI pipeline development
For participants, the experience of working on real data problems with real impact is invaluable — far more meaningful than building yet another todo app or chat interface.
What Comes Next for Open Data AI in India
The AI for All Challenge highlighted both the opportunity and the challenge. India has the data, the developers, and the motivation. What’s needed now is sustained effort to:
- Continue building and maintaining open-source data tools
- Advocate for better data publishing standards across government agencies
- Create more events that bring developers and data together
- Build bridges between the civic tech community and the AI research community
Events like this show why hackathons matter beyond the prizes and the swag. They bring together diverse talent around important problems and produce solutions that wouldn’t exist otherwise.
Was part of this hackathon! Our team built a tool that converts government PDF reports into structured JSON datasets. We processed 500+ reports from data.gov.in during the hackathon. The open data problem in India is real and this event brought much-needed attention to it.
The multilingual data processing challenge was fascinating. We worked on Hindi-to-English translation of agricultural survey data. The quality of submissions was really high — 1,407 teams is no joke.