metadata
license: apache-2.0
base_model:
- OpenGVLab/InternVL2-8B
SpiritSight Agent: Advanced GUI Agent with One Look
Introduction
SpiritSight is a vision-based, end-to-end GUI agent that excels at GUI navigation tasks across a variety of GUI platforms.
Inference
conda create -n spiritsight-agent python=3.9
conda activate spiritsight-agent
pip install -r requirements.txt
pip install flash-attn==2.3.6 --no-build-isolation
python infer_SSAgent-8B.py
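Before launching the inference script, it can help to confirm that the active environment matches the pins in the setup commands above (Python 3.9 and flash-attn 2.3.6). The following is a minimal sketch, not part of the SpiritSight repository: the helper name `check_environment` is hypothetical, and it assumes the pip distribution name `flash-attn` is what `importlib.metadata` reports for the installed package.

```python
import sys
from importlib import metadata


def check_environment(pins, python_version=(3, 9)):
    """Return a list of human-readable problems; an empty list means every check passed.

    `pins` maps a distribution name to the exact version expected,
    or to None when any installed version is acceptable.
    """
    problems = []

    # Compare only major.minor, since the conda command pins python=3.9.
    current = tuple(sys.version_info[:2])
    if current != tuple(python_version):
        problems.append(
            f"Python {current[0]}.{current[1]} is active, "
            f"expected {python_version[0]}.{python_version[1]}"
        )

    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if wanted is not None and found != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")

    return problems


if __name__ == "__main__":
    # Assumption: "flash-attn" is the distribution name recorded by pip.
    for problem in check_environment({"flash-attn": "2.3.6"}):
        print("WARNING:", problem)
```

Running this inside the activated conda environment prints one warning per mismatch and nothing when the environment matches; packages listed in requirements.txt can be added to the `pins` dictionary in the same way.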
Acknowledgments
We thank the following amazing projects that truly inspired us: