An end-to-end (e2e) Voice Language Model by Fish Audio.
Generate audio from text with voice customization
Gradio demo of CharacterGen (SIGGRAPH 2024)
Virtually try garments on images of people
Segment objects in images and videos using text prompts
Vote on and view 3D leaderboard entries
A demo of MetaVoice 1B, a new TTS model by MetaVoice.