Avataar
An India-built AI video model (Varya) tuned for Indian context - festivals, clothing, food, streetscapes - that generates clips in four steps for about Rs 0.48/second on its hosted service, roughly 20x cheaper than Veo or Runway. Released as open weights on India's AIKosh.
Our take
Avataar's Varya is an Indian video model built for local context - it renders Indian festivals, clothing and streets accurately instead of generic Western defaults. Distilled from Wan 2.2, it makes a 5-second 720p clip in about 45 seconds for roughly Rs 0.48/second, far below Veo or Runway, and ships as open weights on AIKosh. Brand new, so quality still trails the frontier.
Best for
Indian e-commerce and marketing teams that need large volumes of culturally accurate product video without frontier-model prices.
Pros
- About Rs 0.48/second - roughly 20x cheaper than Veo, Kling or Runway
- Trained on Indian clothing, food, festivals and streetscapes
- Makes a 5s 720p clip in about 45 seconds (four-step model)
- Released as open weights on India's AIKosh - you can self-host
Cons
- Brand new (June 2026); visual quality trails frontier models
- Distilled from Alibaba's Wan 2.2, not built from scratch
- Hosted API and tooling are still early
How it compares
Against global video tools in our catalog like Runway, Kling and JoggAI, Varya's bet is India-specific realism at a fraction of the cost, plus open weights you can run yourself.
Full review
Most text-to-video models default to a Western visual vocabulary, so Indian clothing, food and festivals come out wrong. Avataar trained Varya specifically on Indian cultural data to fix that, and pairs it with a distillation approach that cuts generation down to four steps: a 5-second 720p clip takes about 45 seconds on an H200, versus more than twenty minutes for Wan 2.2.
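The speed figures above translate into per-GPU throughput with simple arithmetic. What follows is a minimal sketch, assuming clips are generated one at a time with no batching or queueing overhead (our assumption, not something Avataar has published). It treats Wan 2.2's "more than twenty minutes" as exactly twenty minutes, so the speedup it reports is a lower bound. The function names are illustrative only.

```python
# Timings quoted in the review for one 5-second 720p clip on an H200.
VARYA_SECONDS_PER_CLIP = 45
# "More than twenty minutes" is taken as exactly 20 minutes (a lower bound).
WAN22_SECONDS_PER_CLIP = 20 * 60


def clips_per_gpu_hour(seconds_per_clip: float) -> float:
    """Clips one GPU can produce in an hour, assuming strictly serial generation."""
    return 3600 / seconds_per_clip


def speedup(baseline_seconds: float, distilled_seconds: float) -> float:
    """How many times faster the distilled model is than the baseline."""
    return baseline_seconds / distilled_seconds


if __name__ == "__main__":
    print(f"Varya:   {clips_per_gpu_hour(VARYA_SECONDS_PER_CLIP):.0f} clips per GPU-hour")
    print(f"Wan 2.2: {clips_per_gpu_hour(WAN22_SECONDS_PER_CLIP):.0f} clips per GPU-hour (at most)")
    print(f"Speedup: at least {speedup(WAN22_SECONDS_PER_CLIP, VARYA_SECONDS_PER_CLIP):.1f}x")
```

Under these assumptions a single H200 yields about 80 Varya clips per hour against at most 3 for Wan 2.2, a speedup of at least roughly 27x.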
The headline is price: about Rs 0.48 (roughly half a US cent) per second on the hosted service, around 20x cheaper than Veo, Kling, Luma or Runway, with the weights released openly on the government's AIKosh portal so teams can self-host. The catch is maturity - it launched in June 2026, it's distilled from Alibaba's Wan 2.2 rather than built ground-up, and frontier models still win on raw fidelity. For Indian e-commerce video at scale, the economics are hard to ignore.
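To see what the hosted price means for a volume workload, the arithmetic can be sketched as below. This is a rough estimate only: it uses the Rs 0.48/second figure and the "roughly half a US cent" conversion quoted in this review, ignores taxes, minimum charges and failed generations, and the USD 0.10/second competitor rate is a hypothetical input chosen for illustration, not a quoted price. The function names are ours.

```python
# Hosted price quoted in the review.
RS_PER_SECOND = 0.48
# The review's own approximate conversion ("roughly half a US cent").
USD_PER_SECOND = 0.005


def clip_cost_rs(clip_seconds: float, rs_per_second: float = RS_PER_SECOND) -> float:
    """Cost in rupees of one clip of the given length."""
    return clip_seconds * rs_per_second


def batch_cost_rs(n_clips: int, clip_seconds: float,
                  rs_per_second: float = RS_PER_SECOND) -> float:
    """Cost in rupees of a batch of equal-length clips."""
    return n_clips * clip_cost_rs(clip_seconds, rs_per_second)


def price_multiple(other_usd_per_second: float,
                   varya_usd_per_second: float = USD_PER_SECOND) -> float:
    """How many times more expensive another per-second rate is."""
    return other_usd_per_second / varya_usd_per_second


if __name__ == "__main__":
    print(f"One 5 s clip: Rs {clip_cost_rs(5):.2f}")
    print(f"10,000 x 5 s clips: Rs {batch_cost_rs(10_000, 5):,.2f}")
    # 0.10 is a hypothetical competitor rate, not a published price.
    print(f"Multiple vs a USD 0.10/s service: {price_multiple(0.10):.0f}x")
```

At the quoted rate a 5-second clip costs about Rs 2.40, and a catalog of 10,000 such clips about Rs 24,000. The multiple against any other tool moves with the tier and exchange rate you plug in, so treat the output as an estimate.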
Cloudkart Trust Graph
4.0/5
- Actual Utility: 4/5
- Ease of Use: 3/5
- Pricing Fairness: 5/5
- Reliability: 3/5
- Differentiation: 5/5
Source for all five scores: Initial LLM-authored rubric (backfill)
Scored as of . Each score is versioned and auditable; vendors cannot buy it.
How this score is set
- Editorial rubric: primary signal; five dimensions, 4.0/5 average.
- Community reviews: none yet.
- Pricing verified: not yet verified.
- Independence: score set by our editorial team before any affiliate relationship is considered. No vendor can buy it.
Frequently asked questions
- Is Avataar free, and how much does it cost?
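- Avataar's hosted service is paid, at about Rs 0.48 per second of generated video. The Varya weights are also released openly on AIKosh, so teams can self-host instead.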
- Who is Avataar best for?
- Indian e-commerce and marketing teams that need large volumes of culturally accurate product video without frontier-model prices.
- How is Avataar rated on Cloudkart.ai?
- Avataar scores 4.0 out of 5 on the Cloudkart.ai rubric, which weighs actual utility, ease of use, pricing fairness, reliability and differentiation. Scores are set editorially and can never be bought.
Relevant tools
More tools in Video & Audio Generation.
Sora 2
OpenAI's flagship text-to-video-and-audio model, generating clips with synchronized dialogue and sound effects and improved physical realism. Available via the Sora app and web, free to start with limits and paid tiers for more. Replaced the original Sora, which was retired in April 2026.
Google Veo 3
Google's flagship text-to-video model and the first to generate synced audio - dialogue, effects and ambient sound - in the same pass, with strong physics and prompt adherence. Available in the Gemini apps, the Flow tool and the Gemini/Vertex API. Consumer access via Google AI Pro ($19.99/mo) or Ultra ($249.99/mo); API from $0.40/sec, or $0.15/sec with Veo 3 Fast. Limited free trials in Google AI Studio.
Seedance
ByteDance's AI video generator. Seedance 2.0 (Feb 2026) takes text, images, video and audio together and generates video with native, lip-synced audio in 8+ languages, up to 2K and 4-15 seconds, including multi-shot scenes. Reachable through ByteDance's Dreamina app with free credits and via API platforms.
fal
fal is a serverless platform for running generative media models - image, video, audio and 3D - behind one fast API. Developers call models like FLUX, Wan, Veo and Seedream without managing GPUs, and pay only for successful outputs (for example $0.03 per image, $0.05 per second of video), with no subscription and $20 in free credits to start. It has become a default home for open and commercial media models.