Most AI video models top out at 5 to 10 seconds, then fake longer clips by gluing segments together, and the seams show. ByteDance’s Seedance 2.5, unveiled at the Volcano Engine FORCE conference, generates a single native 30-second clip in one pass. Character faces, lighting, and motion hold steady the whole way through, because audio and video are generated jointly in the same latent space instead of synced after the fact.
What it actually does
It’s a text/image-to-video model with two real upgrades. First, that 30-second native duration, which roughly doubles the ceiling of most closed commercial rivals. Second, it takes up to 50 multimodal reference inputs (images, audio, 3D white models, style refs), up from 12 in the last version. Beyond those two, the feature to watch is local re-draw: change a single element inside a frame without touching the rest. That’s editing, not just prompting.
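To make those limits concrete, here is a minimal Python sketch of how a client might check a job before sending it. Only the two numbers (30 seconds, 50 references) and the reference types come from the article; the class names, field names, and payload shape are all hypothetical and do not reflect the actual Volcano Engine API.

```python
from dataclasses import dataclass, field

# Limits quoted in the article. Everything else in this sketch is hypothetical.
MAX_DURATION_S = 30
MAX_REFERENCES = 50
REFERENCE_KINDS = {"image", "audio", "3d_white_model", "style"}


@dataclass
class Reference:
    kind: str  # expected to be one of REFERENCE_KINDS
    uri: str   # placeholder location of the asset


@dataclass
class VideoRequest:
    prompt: str
    duration_s: int = MAX_DURATION_S
    references: list[Reference] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if the request exceeds the article's stated limits."""
        if not 1 <= self.duration_s <= MAX_DURATION_S:
            raise ValueError(f"duration must be 1-{MAX_DURATION_S} s")
        if len(self.references) > MAX_REFERENCES:
            raise ValueError(f"at most {MAX_REFERENCES} references allowed")
        for ref in self.references:
            if ref.kind not in REFERENCE_KINDS:
                raise ValueError(f"unknown reference kind: {ref.kind}")

    def to_payload(self) -> dict:
        """Return a plain dict (hypothetical shape) after validating."""
        self.validate()
        return {
            "prompt": self.prompt,
            "duration_s": self.duration_s,
            "references": [{"kind": r.kind, "uri": r.uri} for r in self.references],
        }
```

Calling `VideoRequest(prompt="...", references=[Reference("image", "file:///shot.png")]).to_payload()` returns a dict ready to serialize, while a 31-second request or a 51st reference fails locally with a `ValueError` before any network call is made.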
API and rollout
Seedance 2.5 is in global enterprise beta now, served through Volcano Engine’s API, with public access landing in early July. Typical use: ads, product demos, short-form content where you need one clean take, not a montage. If you want more on the AI video race, browse our video coverage on topaiproduct.com.