New AI model generates 45-minute lip-synced video from one photo and runs in real time
What Happened
A single image becomes a talking character: LPM 1.0 generates real-time video with lip sync, facial expressions, and emotional reactions. For now, it remains a research project.
Our Take
LPM 1.0 generates real-time lip-synced video from a single static image — up to 45 minutes of output with facial expressions and emotional reactions. No public API yet; research only.
If you're building avatar pipelines on HeyGen or D-ID, you're paying per-minute generation costs that compound fast at any meaningful scale. Most developers overbuild video-generation infrastructure on the assumption that rendering demands heavy async job queues; LPM's real-time output removes that assumption entirely. When this ships, the infrastructure bet changes (see the sketch below).
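To make the infrastructure point concrete, here's a minimal sketch. Every function and object in it (submit_job, get_status, start_session, next_frames) is hypothetical, not HeyGen's, D-ID's, or LPM's actual API; the contrast is the point. Per-minute services force a submit-and-poll job loop, while a real-time model collapses generation into a streaming call.

```python
# Hypothetical sketch: neither API below is real. It contrasts the
# queued workflow today's per-minute services impose with the loop a
# real-time model like LPM would allow.
import time

# --- Today: async queued generation (HeyGen / D-ID style) ---
def generate_avatar_video_queued(client, photo_url: str, script: str) -> str:
    """Submit a render job, then poll until the video is ready."""
    job = client.submit_job(photo=photo_url, text=script)  # hypothetical call
    while True:
        status = client.get_status(job.id)                 # hypothetical call
        if status.state == "done":
            return status.video_url
        if status.state == "failed":
            raise RuntimeError(status.error)
        time.sleep(5)  # minutes-long renders make polling (or webhooks) mandatory

# --- LPM-class: real-time streaming (what the article implies) ---
def stream_avatar_video(model, photo_bytes: bytes, audio_chunks):
    """Yield video frames as audio arrives; no queue, no polling."""
    session = model.start_session(reference_image=photo_bytes)  # hypothetical
    for chunk in audio_chunks:
        yield session.next_frames(chunk)  # frames keep pace with input audio
```

The second shape is why the queue infrastructure becomes dead weight: if frames arrive as fast as audio does, there is no long-running job to track.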
Avatar-heavy products (onboarding, e-learning, agent personas) should watch the release. Everyone else can ignore it for now.
What To Do
In new avatar pipelines, avoid locking into HeyGen's per-minute pricing: LPM-class real-time generation would turn async queued workflows into unnecessary overhead. The back-of-envelope math below shows how the metered bill compounds.
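A quick illustrative cost model; the rate and usage numbers are placeholders I'm assuming for the exercise, not quoted prices from any vendor.

```python
# Illustrative cost model; the rate below is a placeholder, not a
# quoted HeyGen or D-ID price. The point is the linear scaling.
PER_MINUTE_RATE = 0.50   # assumed $/minute of generated video
users = 10_000           # monthly active users of an avatar feature
minutes_per_user = 3     # e.g. a short onboarding walkthrough each

monthly_cost = users * minutes_per_user * PER_MINUTE_RATE
print(f"${monthly_cost:,.0f}/month at {users:,} users")  # $15,000/month

# Self-hosted real-time generation trades that metered bill for fixed
# GPU capacity, so cost stops scaling with minutes of output.
```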
What Skeptics Say
A research demo with no disclosed latency or hardware specs is not a capability claim; it is a preview of what regulators and platform trust-and-safety teams will have to contain before anything like it reaches production.