4bit SanaPipeline

1. Environment setup

Set up the environment by following the installation guidance in the official SVDQuant-Nunchaku repository.
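Before moving on, it can help to confirm that the packages used later in this document are visible to the Python interpreter. The following is a minimal sketch, not part of the official setup: the helper name `missing_packages` is hypothetical, and the package list is taken from the imports in the inference snippet (`torch`, `diffusers`, `nunchaku`). It only checks that each package can be located; it does not verify versions or CUDA support.

```python
import importlib.util

# Top-level packages imported by the inference snippet in this document.
REQUIRED_PACKAGES = ("torch", "diffusers", "nunchaku")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names from `names` that cannot be found on the import path."""
    # find_spec only locates a package; it does not import or execute it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

An empty result means every listed package was found; anything else names the packages that still need to be installed.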

1-1. Quantize Sana with SVDQuant-4bit (Optional)

Convert the .pth checkpoint to the safetensors format required by SVDQuant:

python tools/convert_scripts/convert_sana_to_svdquant.py \
      --orig_ckpt_path Efficient-Large-Model/SANA1.5_1.6B_1024px/checkpoints/SANA1.5_1.6B_1024px.pth \
      --model_type SanaMS1.5_1600M_P1_D20 \
      --dtype bf16 \
      --dump_path output/SANA1.5_1.6B_1024px_svdquant_diffusers \
      --save_full_pipeline

Then follow the Quantization guidance to compress the model.
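When several checkpoints need converting, the conversion command above can be assembled programmatically. The sketch below is an illustration under stated assumptions: `build_convert_command` is a hypothetical helper, and it uses only the script path and flags that appear in the command above. It builds the argument list and prints it; actually running it (for example with `subprocess.run(cmd, check=True)`) is assumed to happen from the repository root.

```python
import shlex
import sys


def build_convert_command(orig_ckpt_path, model_type, dump_path,
                          dtype="bf16", save_full_pipeline=True):
    """Assemble the argument list for convert_sana_to_svdquant.py.

    Only the flags shown in the documented command are used.
    """
    cmd = [
        sys.executable,
        "tools/convert_scripts/convert_sana_to_svdquant.py",
        "--orig_ckpt_path", orig_ckpt_path,
        "--model_type", model_type,
        "--dtype", dtype,
        "--dump_path", dump_path,
    ]
    if save_full_pipeline:
        cmd.append("--save_full_pipeline")
    return cmd


# Same values as the documented command.
cmd = build_convert_command(
    orig_ckpt_path="Efficient-Large-Model/SANA1.5_1.6B_1024px/checkpoints/SANA1.5_1.6B_1024px.pth",
    model_type="SanaMS1.5_1600M_P1_D20",
    dump_path="output/SANA1.5_1.6B_1024px_svdquant_diffusers",
)

# Print a shell-quoted version for inspection before running anything.
print(shlex.join(cmd))
```

Passing the command as a list rather than a single string avoids shell-quoting problems if a path contains spaces.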

2. Code snippet for inference

Here we show the code snippet for SanaPipeline. For SanaPAGPipeline, please refer to the SanaPAGPipeline section.

import torch
from diffusers import SanaPipeline

from nunchaku.models.transformer_sana import NunchakuSanaTransformer2DModel

# Load the SVDQuant INT4 transformer, then plug it into the BF16 Sana pipeline.
transformer = NunchakuSanaTransformer2DModel.from_pretrained("mit-han-lab/svdq-int4-sana-1600m")
pipe = SanaPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_1600M_1024px_BF16_diffusers",
    transformer=transformer,
    variant="bf16",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Cast the text encoder and VAE to bfloat16 explicitly.
pipe.text_encoder.to(torch.bfloat16)
pipe.vae.to(torch.bfloat16)

image = pipe(
    prompt="A cute 🐼 eating 🎋, ink drawing style",
    height=1024,
    width=1024,
    guidance_scale=4.5,
    num_inference_steps=20,
    generator=torch.Generator().manual_seed(42),
).images[0]
image.save("sana_1600m.png")
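The snippet above fixes the seed through `generator=torch.Generator().manual_seed(42)`. To compare several seeds for the same prompt, that call can be wrapped in a small loop. The following is a minimal sketch: `generate_for_seeds` and its `make_generator` parameter are hypothetical names introduced here, and the helper assumes only that `pipe` accepts `prompt` and `generator` keyword arguments and returns an object with an `images` list, as in the code above.

```python
def generate_for_seeds(pipe, prompt, seeds, make_generator=None, **pipe_kwargs):
    """Call `pipe` once per seed and return a {seed: image} dict."""
    if make_generator is None:
        # Imported lazily so the helper itself has no hard torch dependency.
        import torch

        def make_generator(seed):
            # Same seeding pattern as the inference snippet.
            return torch.Generator().manual_seed(seed)

    images = {}
    for seed in seeds:
        result = pipe(prompt=prompt, generator=make_generator(seed), **pipe_kwargs)
        images[seed] = result.images[0]
    return images
```

With the pipeline from the snippet, a call such as `generate_for_seeds(pipe, "A cute 🐼 eating 🎋, ink drawing style", [0, 1, 42], height=1024, width=1024, guidance_scale=4.5, num_inference_steps=20)` would return one image per seed, each of which can be saved with `image.save(...)`.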

3. Online demo

1). Launch the 4bit Sana demo.

python app/app_sana_4bit.py

2). Compare with the BF16 version.

Refer to the original Nunchaku-Sana guidance for SanaPAGPipeline.

python app/app_sana_4bit_compare_bf16.py