---
datasets:
- shuaishuaicdp/GUI-World
language:
- en
license: cc-by-4.0
metrics:
- bertscore
- LLM-as-a-Judge
tags:
- gui
- agent
pipeline_tag: video-text-to-text
---

This is the first VideoLLM with powerful GUI-oriented capabilities, retrained on [GUI-World](https://gui-world.github.io). It was presented in [GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents](https://huggingface.co/papers/2406.10819).

See the [GitHub repository](https://github.com/Dongping-Chen/GUI-World) for instructions on using GUI-Vid for GUI understanding tasks.