metadata

tags:
  - SanaControlNetPipeline
base_model:
  - Efficient-Large-Model/Sana_600M_1024px_diffusers
pipeline_tag: text-to-image

Model card

We introduce Sana, a text-to-image framework that can efficiently generate images at resolutions up to 4096 × 4096. Sana synthesizes high-resolution, high-quality images with strong text-image alignment at remarkably fast speed, and it can be deployed on a laptop GPU.

Source code is available at https://github.com/NVlabs/Sana.

🧨 Diffusers

1. How to use `SanaControlNetPipeline` with `🧨diffusers`

# Run `pip install git+https://github.com/huggingface/diffusers` before using Sana in diffusers.
import torch
from diffusers import SanaControlNetPipeline
from diffusers.utils import load_image

pipe = SanaControlNetPipeline.from_pretrained(
    "ishan24/Sana_600M_1024px_ControlNetPlus_diffusers",
    variant="fp16",
    torch_dtype=torch.float16,
    device_map="balanced"
)

pipe.vae.to(torch.bfloat16)
pipe.text_encoder.to(torch.bfloat16)

cond_image = load_image(
    "https://huggingface.co/ishan24/Sana_600M_1024px_ControlNet_diffusers/resolve/main/hed_example.png"
)
prompt = 'a cat with a neon sign that says "Sana"'
image = pipe(
    prompt,
    control_image=cond_image,
).images[0]
image.save("sana.png")
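The call above relies on the pipeline's default settings. For reproducible runs, or to tune how strongly the control image steers the result, the call can be wrapped in a small helper. The following is a minimal sketch, not part of the original card: `generate_controlled` is a hypothetical name, the default values are illustrative, and the keyword names (`num_inference_steps`, `guidance_scale`, `controlnet_conditioning_scale`, `generator`) are assumed to match the call signature of `SanaControlNetPipeline` in your installed diffusers version, so check them against the API reference before relying on them.

```python
def generate_controlled(pipe, prompt, control_image, seed=0, steps=20,
                        guidance_scale=4.5, control_scale=1.0, generator=None):
    """Run a ControlNet pipeline with explicit settings and a fixed seed.

    Hypothetical helper: the keyword names passed to `pipe` are assumptions
    about the pipeline's call signature, not confirmed by the model card.
    """
    if generator is None:
        # Imported lazily so the helper can be defined without torch installed.
        import torch
        # A seeded CPU generator makes the initial noise repeatable.
        generator = torch.Generator(device="cpu").manual_seed(seed)

    result = pipe(
        prompt,
        control_image=control_image,
        num_inference_steps=steps,
        guidance_scale=guidance_scale,
        controlnet_conditioning_scale=control_scale,
        generator=generator,
    )
    # The pipeline output exposes generated images as a list.
    return result.images[0]
```

With the `pipe`, `prompt`, and `cond_image` objects from the snippet above, usage would look like `generate_controlled(pipe, prompt, cond_image, seed=42).save("sana_seed42.png")`. Keeping the seed fixed while varying `control_scale` is one way to compare how closely the output follows the control image.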
