Generating a five-second video clip on a high-end model like Alibaba’s Wan 2.2 takes roughly 20 minutes of compute time. For a startup trying to serve millions of users in India, that isn't just a technical hurdle; it’s a financial wall. Avataar AI says it has found a way over that wall.
On Tuesday, the Peak XV-backed startup unveiled Varya, a video generation model that it says produces clips 10 times faster than Wan 2.2, the model it is distilled from, while costing roughly 20 times less to run. By distilling Wan 2.2’s large architecture into a leaner, four-step generation process, Avataar has turned a resource-heavy model into a tool that could operate at the scale of India’s consumer internet.
This is a strategic pivot for India’s AI ecosystem. While the U.S. and China race to build the largest foundation models, India is betting that its path to dominance lies in extreme efficiency and cultural specificity. Varya is one of the first major outputs from the government’s $1.2 billion India AI Mission, which provides startups with subsidized GPU compute in exchange for open-weight releases.
The Math Behind the Efficiency
The performance gap between Varya and the model it is built on is stark. On an Nvidia H200 GPU, Varya generates a 720p clip in 45 seconds. The original Wan 2.2 model requires 1,230 seconds (about 20 minutes) to produce the same output.
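As a rough check on these figures, the benchmark times can be turned into a speedup ratio. The short Python sketch below uses only numbers quoted in this article (1,230 and 45 seconds; 50 and 4 generation steps). The variable and function names are illustrative and are not part of any Avataar or Wan tooling.

```python
# Figures quoted in the article; all names here are illustrative only.
WAN_SECONDS = 1230    # Wan 2.2, 720p clip on an Nvidia H200
VARYA_SECONDS = 45    # Varya, same output on the same hardware
WAN_STEPS = 50        # generation steps before distillation
VARYA_STEPS = 4       # generation steps after distillation


def ratio(baseline: float, improved: float) -> float:
    """Return how many times smaller `improved` is than `baseline`."""
    if improved <= 0:
        raise ValueError("improved must be positive")
    return baseline / improved


wall_clock_speedup = ratio(WAN_SECONDS, VARYA_SECONDS)
step_reduction = ratio(WAN_STEPS, VARYA_STEPS)

print(f"Wall-clock speedup: {wall_clock_speedup:.1f}x")  # 27.3x
print(f"Step reduction:     {step_reduction:.1f}x")      # 12.5x
```

On these inputs the benchmark ratio comes to about 27x and the step ratio to 12.5x. Both sit above the 10x headline figure, which may be a conservative or differently measured number; the article does not say which.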
This speed isn't just for convenience; it is the primary driver of the model's pricing. Avataar plans to charge ₹0.48 ($0.005) per second of video on its hosted service. For context, established services such as Luma, Kling, and Runway typically charge $0.10 or more per second.
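To make the pricing gap concrete, the sketch below works out the cost of a single five-second clip and of a large batch at both rates. The per-second prices come from this article; the batch size of one million clips is a hypothetical volume chosen for illustration, and the names are not taken from any real billing API.

```python
# Per-second prices quoted in the article; all names are illustrative only.
VARYA_USD_PER_SECOND = 0.005
TYPICAL_USD_PER_SECOND = 0.10   # quoted as "$0.10 or more", so a lower bound

CLIP_SECONDS = 5                # clip length from the article's opening example
CLIPS = 1_000_000               # hypothetical volume, an assumption


def clip_cost(usd_per_second: float, clip_seconds: float) -> float:
    """Cost in USD of one clip billed per second of generated video."""
    return usd_per_second * clip_seconds


varya_clip = clip_cost(VARYA_USD_PER_SECOND, CLIP_SECONDS)
typical_clip = clip_cost(TYPICAL_USD_PER_SECOND, CLIP_SECONDS)
price_ratio = TYPICAL_USD_PER_SECOND / VARYA_USD_PER_SECOND

print(f"Varya, one clip:    ${varya_clip:.3f}")
print(f"Typical, one clip:  ${typical_clip:.2f}")
print(f"Varya, {CLIPS:,} clips:   ${varya_clip * CLIPS:,.0f}")
print(f"Typical, {CLIPS:,} clips: ${typical_clip * CLIPS:,.0f}")
print(f"Price ratio: {price_ratio:.0f}x")
```

Because the $0.10 figure is a floor, the 20x ratio is the minimum gap these prices imply; at higher competitor prices the difference would be larger.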
“India is a video-first market,” said Rajan Anandan, managing director at Peak XV. “Current AI video models are too expensive for population-scale use in India. If video AI is going to reach students, teachers, MSMEs, and creators, costs have to come down dramatically.”
Solving for Cultural Context
Beyond the price tag, Varya aims to solve the “generic output” problem that plagues many Western-trained models. When prompted to generate scenes involving Indian festivals, traditional clothing, or regional architecture, global models often default to stereotypes or inaccurate representations.
Avataar claims it has addressed this by training Varya on a curated dataset designed to capture Indian cultural nuances. By releasing the model as open-weight on the government’s AIKosh portal, the company is inviting developers to further refine these capabilities, effectively crowdsourcing improvements to the model's cultural accuracy.
What This Means for Developers
For the Indian developer community, Varya represents a shift from being consumers of foreign models to architects of local solutions. Because the model and its training data are being released publicly, developers can self-host the technology or modify it for specific enterprise use cases, such as localized advertising or educational content.
Avataar is also signaling an openness to integration, stating it is exploring partnerships with existing video tools like Higgsfield and Adobe Firefly.
Key Takeaways
- Cost Disruption: At $0.005 per second, Varya is roughly 20 times cheaper than current market leaders, making it viable for mass-market applications in India.
- Efficiency Gains: Through model distillation, Varya reduces the generation process from 50 steps to just four, resulting in a 10x increase in speed.
- Cultural Nuance: The model is specifically trained on Indian datasets to accurately depict local clothing, food, and festivals, avoiding the generic outputs common in global models.
The Road Ahead
The launch of Varya is a test case for the India AI Mission’s broader strategy: prioritize application-layer innovation over the expensive, high-stakes race for the world’s largest foundation model. With the government aiming to attract $200 billion in AI investment by 2028, the pressure is on to prove that these subsidized models can actually drive adoption.
Whether Varya can scale to the millions of users Avataar envisions will depend on its reliability outside of controlled demos. The company has opened the model for public testing on its website, and the next few months will reveal if the model’s speed and cost-efficiency hold up under real-world, high-volume traffic.