metadata
tags:
- SanaControlNetPipeline
base_model:
- Efficient-Large-Model/Sana_600M_1024px_diffusers
pipeline_tag: text-to-image
Model card
We introduce Sana, a text-to-image framework that can efficiently generate images up to 4096 × 4096 resolution. Sana synthesizes high-resolution, high-quality images with strong text-image alignment at a remarkably fast speed, and it can be deployed on a laptop GPU.
Source code is available at https://github.com/NVlabs/Sana.
🧨 Diffusers
1. How to use SanaControlNetPipeline with 🧨 diffusers
# Run `pip install git+https://github.com/huggingface/diffusers` before using Sana in diffusers.
import torch
from diffusers import SanaControlNetModel, SanaControlNetPipeline
from diffusers.utils import load_image

# Load the pipeline in fp16 and balance its components across the available devices.
pipe = SanaControlNetPipeline.from_pretrained(
    "ishan24/Sana_600M_1024px_ControlNetPlus_diffusers",
    variant="fp16",
    torch_dtype=torch.float16,
    device_map="balanced",
)
# Cast the VAE and the text encoder to bfloat16.
pipe.vae.to(torch.bfloat16)
pipe.text_encoder.to(torch.bfloat16)

# Conditioning image that guides generation (the HED example from the model repo).
cond_image = load_image(
    "https://huggingface.co/ishan24/Sana_600M_1024px_ControlNet_diffusers/resolve/main/hed_example.png"
)

prompt = 'a cat with a neon sign that says "Sana"'
image = pipe(
    prompt,
    control_image=cond_image,
).images[0]
image.save("sana.png")
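If your own conditioning image is not square, one option is to center-crop it to a square and resize it before passing it as `control_image`, so that it is not stretched to fit the 1024px base model. The sketch below is only an illustration under that assumption: `center_crop_box` and `prepare_control_image` are hypothetical helper names, not part of diffusers, and the default size of 1024 is taken from the base model name rather than from any documented requirement.

```python
def center_crop_box(width, height):
    """Return the (left, upper, right, lower) box of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


def prepare_control_image(image, size=1024):
    """Center-crop a PIL image to a square, then resize it to size x size.

    `image` is expected to be a PIL image, such as the one returned by
    diffusers.utils.load_image. The 1024 default is an assumption.
    """
    box = center_crop_box(*image.size)
    return image.crop(box).resize((size, size))
```

With the example above, this would be applied as `cond_image = prepare_control_image(cond_image)` before calling `pipe(...)`.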