A hybrid reasoning model that runs locally in your browser.
Generate descriptions based on camera input.