Generate realistic dialogue from a script using Dia!
FitDiT is a high-fidelity virtual try-on model.
Control appearance or pose in person images using reference images
Zero-shot voice cloning with llasa 3b (Unofficial Demo)
Generate realistic voice audio from text and audio prompts
Generate audio from text with voice customization
Upgraded to v1.0!
An end-to-end (e2e) Voice Language Model by Fish Audio.