Generate high-resolution images from text prompts
Vision Transformer Attention Visualization
Generate and convert speech using text and audio inputs