Generate a realistic talking video from an image and audio
Segment objects in images and videos using text prompts