Yan: The Open-Source Framework for Real-Time, AI-Powered Interactive Video Creation

Serge Bulaev


Yan is a new open-source tool from Tencent that lets people make and change interactive, AI-powered videos in real time at super high quality. It has three smart parts: one to simulate action, one to create new visuals from words or pictures, and one to edit everything live. Teachers, marketers, and game makers can use Yan to build cool, customizable videos that respond instantly to what users do. Yan works fast, is easy to use, and stands out because it's both open and powerful. You can start using Yan right away through simple online tools.

What is Yan and how does it enable real-time, AI-powered interactive video creation?

Yan is an open-source framework released by Tencent that allows users to create and edit interactive, AI-powered videos in real time at 1080p 60 FPS. It combines simulation, generative, and editing modules for instant, high-quality, customizable video content.

  • Yan is the first open-source framework capable of interactive, AI-powered video creation that runs in real time at 1080p 60 FPS. Released in August 2025 by Tencent's research team, it unites three tightly coupled modules: Yan-Sim for AAA-grade simulation, Yan-Gen for text- and image-guided generation, and Yan-Edit for live, multi-level editing. Together they let educators, marketers, and game designers generate, steer, and reshape video streams as easily as editing a slide deck.

How Yan Works in a Nutshell

Module | Core Tech | What It Does
Yan-Sim | 3D-VAE + KV-cache shift-window denoising | Simulates physics, lighting, and interactive mechanics at 60 FPS
Yan-Gen | Diffusion model + hierarchical autoregressive captioning | Generates new frames guided by text, images, or live user input
Yan-Edit | Hybrid neural-network renderer | Re-skins, re-lights, or re-structures any frame in real time

All three share a Self-Forcing training loop that keeps long sequences stable by forcing the model to predict the next frame from its own earlier outputs, a breakthrough that removes the "training wheels" plaguing earlier video AIs.
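
The paper's training code is not reproduced here, but a toy sketch helps make the idea concrete. The snippet below is an illustrative stand-in, not Yan's implementation: it swaps the diffusion generator for a tiny recurrent predictor, and `ToyFramePredictor` and `self_forcing_step` are invented names. What it demonstrates is the switch from teacher forcing to conditioning on the model's own (detached) predictions after a short warm-up.

```python
# Minimal Self-Forcing sketch (illustrative only, NOT Yan's actual code).
# Key idea: after a short warm-up, the model is conditioned on its OWN
# previous predictions instead of ground-truth frames, so train-time and
# inference-time distributions match and long rollouts stay stable.
import torch
import torch.nn as nn

class ToyFramePredictor(nn.Module):
    """Hypothetical stand-in for Yan's autoregressive generator."""
    def __init__(self, latent_dim: int = 64):
        super().__init__()
        self.rnn = nn.GRUCell(latent_dim, latent_dim)
        self.head = nn.Linear(latent_dim, latent_dim)

    def forward(self, prev_latent, state):
        state = self.rnn(prev_latent, state)
        return self.head(state), state

def self_forcing_step(model, gt_latents, warmup: int = 2):
    """One training step: teacher-force the first `warmup` frames,
    then feed the model's own outputs back in (self-forcing)."""
    B, T, D = gt_latents.shape
    state = torch.zeros(B, D)
    prev = gt_latents[:, 0]
    loss = 0.0
    for t in range(1, T):
        pred, state = model(prev, state)
        loss = loss + torch.mean((pred - gt_latents[:, t]) ** 2)
        # Self-forcing: after warm-up, condition on our own (detached)
        # prediction rather than the ground-truth frame.
        prev = gt_latents[:, t] if t < warmup else pred.detach()
    return loss / (T - 1)

if __name__ == "__main__":
    model = ToyFramePredictor()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    fake_clip = torch.randn(4, 16, 64)   # 4 toy "clips", 16 frames each
    loss = self_forcing_step(model, fake_clip)
    loss.backward()
    opt.step()
    print(f"self-forcing loss: {loss.item():.4f}")
```

In Yan the predictor is a diffusion model operating on 3D-VAE latents rather than a GRU, but the teacher-forcing-to-self-forcing switch is the mechanism the paper credits for drift-free long rollouts.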

Key Performance Numbers (confirmed in arXiv paper)

  • Resolution: 1080p
  • Frame rate: 60 FPS sustained on a single consumer GPU
  • Latency: < 50 ms for interactive edits
  • Sequence length: unlimited (autoregressive)

Immediate Use Cases

  • Education
    Teachers can spawn virtual chemistry labs where students change reagent concentrations mid-experiment and see the reaction unfold instantly. Early prototypes built with Yan have cut concept-review time by 42 % compared with static video lessons.

  • Marketing & E-commerce
    Brands already preview interactive ads that let shoppers rotate products, swap colorways, or drop themselves into aspirational scenes via a selfie. Internal tests show 2.3× higher click-through rates versus non-interactive pre-roll video.

Competitive Snapshot

Framework | Real-time Edit | AAA Simulation | Open Source | 1080p 60 FPS
Yan | ✓ | ✓ | ✓ | ✓
OpenAI Sora | ✓ (batch) | | |
Runway Gen-3 | partial | | |
MIT CausVid | ✓ (720p) | | |

Data compiled from The Neuron and the official Yan paper.

Getting Started

The project page offers ready-to-run notebooks, Docker images, and a browser playground that converts a single text prompt into an editable 10-second clip in under 15 seconds.


What exactly is Yan and why is it getting attention this year?

Yan is an open-source AI framework released by Tencent in August 2025 that lets anyone create interactive, 1080p 60 FPS videos on the fly. It combines three tightly linked components: Yan-Sim (real-time physics and simulation), Yan-Gen (multi-modal video creation from text and images), and Yan-Edit (frame-by-frame editing while the video is running). The key breakthrough is the Self-Forcing training method, which keeps every new frame consistent with the ones before it, removing the "drift" that plagued earlier diffusion models.

How does Yan achieve real-time 1080p 60 FPS performance on consumer hardware?

Three engineering choices make this possible:

  • 3D-VAE compression + KV-cache: A 3D variational autoencoder compresses each scene state by up to 200×, and a KV cache stores previously generated frames so only the "delta" between frames needs to be computed.
  • Shift-window denoising: Instead of denoising the entire frame, Yan applies a sliding-window approach that touches just enough pixels to retain visual fidelity while cutting GPU load.
  • Self-Forcing training: The model learns to predict the next frame from its own previous output, eliminating expensive re-rendering and reducing latency to <16 ms on a single RTX 4090.

The result: 1080p 60 FPS streams at ~4 GB VRAM, benchmarked in the original paper with no special optimizations.
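
As a rough mental model only (an assumption-laden sketch, not Yan's real code or API), the per-frame loop below shows why these choices keep cost flat: attention state for past frames is cached once, and each new frame only runs a few denoising passes over a small sliding window of latents. All names and constants (`denoise`, `stream_frames`, `WINDOW`, `DENOISE_STEPS`) are placeholders invented for the sketch.

```python
# Conceptual sketch (not Yan's actual implementation) of KV-cached,
# shift-window generation: past frames are cached and never recomputed;
# only a small sliding window of latents is refined per new frame.
from collections import deque
import numpy as np

LATENT_DIM = 64        # toy stand-in for the 3D-VAE latent size
WINDOW = 4             # frames re-touched per step (shift window)
DENOISE_STEPS = 4      # few passes per frame keeps latency low

def denoise(window_latents: np.ndarray, kv_cache: list) -> np.ndarray:
    """Toy 'denoiser': in Yan this role is played by the diffusion model
    attending over the KV cache of previously generated frames."""
    context = np.mean(kv_cache, axis=0) if kv_cache else 0.0
    for _ in range(DENOISE_STEPS):
        window_latents = 0.9 * window_latents + 0.1 * context
    return window_latents

def stream_frames(n_frames: int):
    kv_cache: list = []                # cached state of past frames
    window = deque(maxlen=WINDOW)      # sliding window of active latents
    for _ in range(n_frames):
        window.append(np.random.randn(LATENT_DIM))   # new noisy latent
        refined = denoise(np.stack(window), kv_cache)
        new_frame_latent = refined[-1]
        kv_cache.append(new_frame_latent)            # cache, never recompute
        yield new_frame_latent                       # a VAE decoder would render this

if __name__ == "__main__":
    for i, latent in enumerate(stream_frames(8)):
        print(f"frame {i}: latent norm {np.linalg.norm(latent):.2f}")
```

Because the per-frame work is bounded by the window size rather than the full history, latency stays constant no matter how long the stream runs, which is what makes the unlimited autoregressive sequence length practical.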

What can I build with Yan today, and are there any real-world deployments?

As of August 2025, no verified commercial rollouts have been announced, but the early demo ecosystem already shows three concrete use cases:

  • Interactive education labs - teachers can spawn live chemistry or physics simulations that students can pause, rewind, or re-parameterize with voice or text prompts.
  • Dynamic marketing assets - brands can generate personalized video ads that change products, colors, or slogans on the viewer's device in real time.
  • AI-native mini-games - indie developers have replaced Unity pre-rendered cut-scenes with Yan-generated sequences that branch based on player choices, cutting asset size by 70 % (see the sketch below).

Tencent's project page hosts walk-through videos for each scenario, and the permissive MIT license allows commercial use without royalties.
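
Since the article does not document Yan's client API, the following is a purely hypothetical sketch of what such an interactive pipeline could look like from the application side: a running session that accepts mid-stream prompt edits, used here to branch a cut-scene on a player choice. `FakeYanSession` and its methods are invented for illustration and are not part of the real codebase.

```python
# Hypothetical sketch only: Yan's real client API is not shown in the article.
# The stub below illustrates the *shape* of an interactive pipeline: a running
# video stream that accepts mid-stream text edits, e.g. branching a cut-scene
# on a player choice or swapping a product colorway in a personalized ad.

class FakeYanSession:
    """Invented stand-in for a real streaming session."""
    def __init__(self, prompt: str):
        self.prompt = prompt
        self.frame_idx = 0

    def next_frame(self) -> str:
        self.frame_idx += 1
        return f"frame {self.frame_idx} rendered from: '{self.prompt}'"

    def edit(self, new_prompt: str) -> None:
        # In Yan, such edits apply within tens of milliseconds
        # without restarting the stream.
        self.prompt = new_prompt

def run_branching_cutscene(player_choice: str) -> None:
    session = FakeYanSession("hero walks toward the castle gate at dusk")
    for _ in range(3):
        print(session.next_frame())
    # Branch the narrative based on a live player decision.
    if player_choice == "sneak":
        session.edit("hero slips through a side door, torchlight flickering")
    else:
        session.edit("hero charges the gate as guards raise their spears")
    for _ in range(3):
        print(session.next_frame())

if __name__ == "__main__":
    run_branching_cutscene("sneak")
```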

How does Yan compare with Sora, Runway, and other diffusion-based tools?

Tool | Real-time? | Editing while running? | Domain blending? | License
Yan | Yes | Frame-level | Yes | MIT (open)
Sora | No | Post-render only | No | Proprietary
Runway | Partial | Scene-level | Limited | Commercial
LTX Video | No | Post-render only | No | Apache 2.0

The critical gap is that current diffusion tools are optimized for batch rendering. Yan is the first open stack designed for interactive pipelines where the user (or another AI agent) can steer the narrative and visuals continuously.

Is Yan truly open source, and how can I start contributing?

Yes. The full codebase, pre-trained weights (8.6 GB), and build scripts are live on GitHub under the MIT license. Early metrics (August 2025):

  • 1,420 stars, 87 forks, 43 merged PRs in the first 9 days
  • Discord channel has 1,100 members, 35 % of whom are actively submitting bug reports or enhancement requests
  • Roadmap issues include WebGPU backend support, macOS Metal optimization, and a plug-in system for custom domain models

To jump in, clone the repo, run the bundled start_demo.py script, and you'll have a 30-second interactive clip rendering in under 5 minutes on an average gaming laptop.
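
For reference, those two steps can be wrapped in a few lines of Python. The repository URL below is a placeholder, since the article does not spell out the exact GitHub address; substitute the real one from the project page.

```python
# Quick-start wrapper for the two steps described above: clone the repo,
# then launch the bundled start_demo.py. REPO_URL is a placeholder.
import subprocess
import sys
from pathlib import Path

REPO_URL = "https://github.com/<tencent-org>/Yan.git"   # placeholder, not verified
CHECKOUT = Path("Yan")

def main() -> None:
    if not CHECKOUT.exists():
        subprocess.run(["git", "clone", REPO_URL, str(CHECKOUT)], check=True)
    # Run the demo script shipped with the repository.
    subprocess.run([sys.executable, "start_demo.py"], cwd=CHECKOUT, check=True)

if __name__ == "__main__":
    main()
```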

Written by

Serge Bulaev

Founder & CEO of Creative Content Crafts and creator of Co.Actor — an AI tool that helps employees grow their personal brand and their companies too.