Overall Framework
LongLive-2.0 treats algorithm and infrastructure as one system. On the training side, Balanced SP and NVFP4 make long-video autoregressive (AR) fine-tuning practical. On the inference side, W4A4 execution, an NVFP4 KV cache, parallel dequantization, and asynchronous VAE decoding improve end-to-end throughput.
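To make the 4-bit idea behind NVFP4 and W4A4 concrete, the following is a minimal sketch of block-scaled FP4 quantization in plain Python. It is not LongLive-2.0's implementation. The function names are hypothetical, the block size of 16 and the E2M1 value grid follow the publicly described NVFP4 format, and the per-block scale is kept as an ordinary Python float rather than the FP8 scale (plus per-tensor scale) that real NVFP4 storage uses.

```python
# Illustrative sketch only: block-scaled 4-bit (E2M1) quantization.
# Assumptions: block size 16, scale kept as a Python float (real NVFP4
# stores the block scale in FP8 and adds a per-tensor scale).

FP4_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)  # E2M1 magnitudes
BLOCK = 16  # elements sharing one scale


def quantize_block(values):
    """Return (scale, codes); each code is a signed value from FP4_GRID."""
    amax = max(abs(v) for v in values)
    # Map the largest magnitude in the block onto the largest grid value.
    scale = amax / FP4_GRID[-1] if amax > 0 else 1.0
    codes = []
    for v in values:
        # Round the scaled magnitude to the nearest representable value.
        mag = min(FP4_GRID, key=lambda g: abs(abs(v) / scale - g))
        codes.append(-mag if v < 0 else mag)
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate values from one quantized block."""
    return [c * scale for c in codes]


def quantize(values):
    """Split a flat list into blocks and quantize each independently."""
    return [quantize_block(values[i:i + BLOCK])
            for i in range(0, len(values), BLOCK)]


def dequantize(blocks):
    """Concatenate the dequantized blocks back into one flat list."""
    out = []
    for scale, codes in blocks:
        out.extend(dequantize_block(scale, codes))
    return out
```

Because each block carries its own scale, blocks can be dequantized independently of one another, which is the property that makes dequantizing many blocks in parallel possible in principle. In this sketch the worst-case error per element is the block's maximum magnitude divided by 6, set by the widest gap in the grid (between 4 and 6).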
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation