Top AI Product

Every day, hundreds of new AI tools launch across Product Hunt, Hacker News, and GitHub. We dig through the noise so you don't have to — surfacing only the ones worth your attention with honest, no-fluff reviews. Explore our latest picks, deep dives, and curated collections to find your next favorite AI tool.


Sora Is Dead. LTX 2.3 (Lightricks) Ships 22B Open-Source Video + Audio in a Single Forward Pass.

The timing is almost poetic. On March 24, OpenAI announced it’s killing Sora — the app, the API, and the billion-dollar Disney partnership that was supposed to define AI video. One day later, Lightricks drops LTX 2.3: a 22-billion-parameter open-source model that generates synchronized video and audio in a single forward pass, at up to 4K resolution and 50 FPS.

No API key required. No subscription. Apache 2.0 license. Run it on your own GPU.

This isn’t a research preview or a waitlist product. LTX 2.3 is a fully deployable Diffusion Transformer (DiT) foundation model — and it’s the first open-source model to ship native audio-video joint generation. While OpenAI retreats from video to “focus compute on AGI,” the open-source ecosystem just leapfrogged what Sora could do at its peak.

What LTX 2.3 Actually Does

LTX 2.3 is built on a substantially reworked architecture. Lightricks nearly tripled the parameter count, from roughly 8 billion (LTX-2) to 22 billion, and redesigned the temporal attention mechanism from scratch.

The headline feature: synchronized audio and video generation in a single model pass. Previous approaches required separate models — one for video, one for audio — stitched together in post-processing. LTX 2.3 handles both natively, which means lip sync, ambient sound, and scene-appropriate audio come out of the box without alignment hacks.

Key specs:

  • Resolution: Up to 4K output, native 1080p generation including 9:16 portrait mode
  • Frame rate: 50 FPS
  • Duration: Up to 20 seconds per clip
  • Modes: Text-to-video, image-to-video, audio-to-video
  • Audio: Built-in HiFi-GAN vocoder for cleaner sound generation
  • Customization: LoRA adapter support with up to 3 custom adapters simultaneously
  • Model variants: Full dev model (bf16) and 8-step distilled version for faster inference
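The spec numbers above combine into some useful back-of-the-envelope figures. A minimal sketch, using only the frame rate, clip length, and 4K resolution from the list (everything else is plain arithmetic):

```python
# Back-of-the-envelope math from the spec list above.
FPS = 50            # maximum frame rate
MAX_SECONDS = 20    # maximum clip duration
WIDTH, HEIGHT = 3840, 2160  # 4K UHD output

frames = FPS * MAX_SECONDS           # frames in a maximum-length clip
pixels_per_frame = WIDTH * HEIGHT    # raw pixels per 4K frame
total_pixels = frames * pixels_per_frame

print(frames)                         # 1000 frames per 20-second clip
print(f"{pixels_per_frame:,}")        # 8,294,400 pixels per frame
print(f"{total_pixels / 1e9:.1f}B")   # ~8.3 billion pixels per clip
```

A thousand frames of 4K per clip is a useful mental model for why the hardware requirements discussed later are what they are.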

The visual quality improvements are substantial too. A completely rebuilt VAE produces sharper textures, more realistic faces, better hair rendering, and cleaner text in video. A 4x larger gated attention text connector means prompts are followed more closely — one of the biggest complaints with earlier versions.

The Sora-Shaped Hole in AI Video

To appreciate why LTX 2.3 matters right now, you have to understand how badly the commercial AI video market just imploded.

OpenAI launched Sora as a standalone app in September 2025. It shot to the top of the App Store. Disney pledged $1 billion. The narrative was set: closed-source, big-budget AI video was the future.

Then reality hit. Sora’s compute costs were astronomical. Month-over-month installs declined through early 2026. Disney walked away from the deal. And on March 24, OpenAI pulled the plug entirely — reallocating resources to AGI research and robotics. The Sora 2 model still exists behind ChatGPT’s paywall, but the dedicated app and API are gone.

This leaves a vacuum. Google’s Veo 3 is powerful but locked inside Google’s ecosystem. Runway and Kling offer strong commercial products, but at commercial prices. For developers, researchers, and indie creators who want to build on top of video generation — not just consume it through an app — the options just narrowed dramatically.

LTX 2.3 fills that gap. And it fills it with something Sora never offered: full open weights you can fine-tune, deploy, and modify.

How It Stacks Up Against the Competition

The open-source video generation space has gotten crowded in 2026. Here’s where LTX 2.3 sits relative to the field:

vs. Wan 2.1 (Alibaba): Wan 2.1 still leads on raw visual realism for single-subject scenes — skin textures, animal fur, and cinematic close-ups are its strength. But Wan has no native audio generation and requires separate pipelines for sound. LTX 2.3’s integrated audio-video approach is a genuine differentiator for anyone building end-to-end video workflows.

vs. HunyuanVideo (Tencent): HunyuanVideo excels at multi-person scenes and crowd dynamics — something most open-source models struggle with. But it demands high-end hardware (A100/H100 territory) and also lacks native audio. LTX 2.3 is more accessible on consumer hardware and offers the audio advantage.

vs. Google Veo 3: Lightricks claims LTX 2.3 delivers “commercial-grade generation quality on par with Veo 3.” That’s a bold claim, but the key difference is access: Veo 3 is locked to Google’s platforms, while LTX 2.3 is Apache 2.0 open-source. For anyone who needs to self-host, fine-tune, or avoid vendor lock-in, LTX 2.3 is the obvious choice.

vs. Closed commercial tools (Runway, Kling): These remain more polished for non-technical users. But at $0.04/second via Fal AI and similar inference providers, LTX 2.3’s API costs are roughly 5x cheaper than comparable Sora pricing was — and running it locally costs nothing beyond electricity.
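At those rates, per-clip API costs are easy to estimate. A quick sketch using the $0.04/second fast-mode figure cited above (actual provider pricing may vary by resolution and tier):

```python
PRICE_PER_SECOND = 0.04  # USD, fast-mode inference pricing cited above

def clip_cost(seconds: float) -> float:
    """Approximate API cost of generating one clip of the given length."""
    return seconds * PRICE_PER_SECOND

print(f"${clip_cost(5):.2f}")   # $0.20 for a 5-second clip
print(f"${clip_cost(20):.2f}")  # $0.80 for a maximum-length 20-second clip
```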

Running It Locally: Hardware Reality Check

The full 22B model in bf16 precision needs approximately 44GB of VRAM — meaning an A100 or multi-GPU setup for full-quality generation. But Lightricks shipped several quantized options that make local deployment more realistic:

  • FP8 quantized: ~22GB VRAM, works on RTX 4090/3090 with some optimization. This is the sweet spot for most users.
  • Distilled model: 8-step inference for faster generation with lower memory overhead.
  • CPU offloading: Gets the floor down to around 12GB VRAM (RTX 3060), though generation times stretch to 45-60 minutes per clip.
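The VRAM figures above follow almost directly from parameter count times bytes per weight. A rough sketch for the weights alone (activations, the VAE, and the text encoder add real overhead on top, which is why the practical recommendations run higher):

```python
PARAMS = 22e9  # 22 billion parameters

def weight_vram_gb(bytes_per_param: float) -> float:
    """Approximate VRAM needed for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return PARAMS * bytes_per_param / 1e9

print(weight_vram_gb(2.0))  # bf16 (2 bytes/param): ~44 GB, the full-precision figure
print(weight_vram_gb(1.0))  # fp8  (1 byte/param):  ~22 GB, the RTX 4090/3090 sweet spot
```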

For comfortable 1080p generation, Lightricks recommends 24GB+ VRAM and 128GB system RAM. The model is natively supported in ComfyUI v0.16 — Lightricks worked directly with the ComfyUI team on integration — and ships with reference workflows for text-to-video, image-to-video, and multi-stage generation with latent upscaling.

There’s also a desktop app that bundles everything for local inference on consumer hardware, which lowers the barrier significantly for non-technical users.

Who Built This and Why It’s Open-Source

Lightricks is a Jerusalem-based company founded in 2013 by five Hebrew University PhD students. They’re best known for Facetune and Videoleap — consumer photo and video editing apps that have been downloaded hundreds of millions of times. The company has raised $335 million from Goldman Sachs, Insight Partners, and others, with a $1.8 billion valuation as of their 2021 Series D.

The pivot to AI video models started in 2022. By late 2024, they released the original LTX-Video (2B parameters). In May 2025 came a 13B version that could generate 5 seconds of video in 4 seconds. In January 2026, LTX-2 arrived as the first production-ready open-source audio+video model. Now LTX 2.3 pushes the architecture further with 22B parameters and meaningful quality gains.

The open-source strategy is deliberate. As one Israeli tech publication put it, Lightricks is “aiming for its DeepSeek moment” — betting that open weights will drive adoption, ecosystem development, and ultimately commercial revenue through their LTX Studio product and API services. The model is free for companies under $10M annual revenue; larger organizations need commercial licensing.

The GitHub repository has accumulated over 9,400 stars, with active community development around ComfyUI nodes, custom workflows, and fine-tuning pipelines.

FAQ

Is LTX 2.3 really free to use?
Yes, for most users. The weights are freely available, and commercial use is permitted for organizations under $10M in annual revenue. Above that threshold, you need to contact Lightricks for commercial licensing terms, whether you self-host or not.

What GPU do I need to run LTX 2.3 locally?
The practical minimum is 12GB VRAM (RTX 3060) with CPU offloading enabled, though generation will be slow. For comfortable 1080p work, aim for 24GB+ VRAM — an RTX 4090 or RTX 3090 handles the FP8 quantized model well. Full bf16 quality requires 48GB+ (A100-class hardware).

How does LTX 2.3 compare to Sora?
LTX 2.3 matches or exceeds Sora’s capabilities in several areas: longer clips (20s vs Sora’s typical 5-15s), higher frame rates (50 FPS), native audio generation (Sora had no built-in audio), and open weights for customization. Sora arguably had an edge in prompt understanding for complex narrative scenes, but that’s now moot since OpenAI shut the product down.

Can I use LTX 2.3 through an API instead of running it locally?
Yes. Several inference providers host the model, with pricing around $0.04 per second of generated video in fast mode. Lightricks also offers its own API with both fast and pro tiers.

What are the best alternatives to LTX 2.3?
For pure visual quality: Wan 2.1 (Alibaba). For multi-person scenes: HunyuanVideo (Tencent). For a polished commercial product: Runway Gen-4 or Kling 3.0. For a Google-ecosystem solution: Veo 3. LTX 2.3’s unique advantage is the combination of open-source access, native audio generation, and a practical local deployment story.

