Upload folder using huggingface_hub
README.md
CHANGED
@@ -50,7 +50,7 @@ Application: https://github.com/bytedance/UI-TARS-desktop
 | Benchmark | UI-TARS-1.5 | OpenAI CUA | Claude 3.7 | Previous SOTA |
 |-----------|-------------|------------|------------|----------------|
 | [ScreensSpot-V2](https://arxiv.org/pdf/2410.23218) | **94.2** | 87.9 | 87.6 | 91.6 |
-| [ScreenSpotPro](https://
+| [ScreenSpotPro](https://arxiv.org/pdf/2504.07981v1) | **61.6** | 23.4 | 27.7 | 43.6 |



@@ -76,6 +76,16 @@ Application: https://github.com/bytedance/UI-TARS-desktop
 | | (chicken) | 0.1 | 0.0 | 0.4 | 0.5 | 0.6 |
 | | **100 Tasks Avg.** | 0.04 | 0.03 | 0.18 | 0.25 | 0.31 |

+# Model Scale Comparison
+
+This table compares performance across different model scales of UI-TARS on the OSworld benchmark.
+
+| **Benchmark Type** | **Benchmark** | **UI-TARS-72B-DPO** | **UI-TARS-1.5-7B** | **UI-TARS-1.5** |
+|--------------------|------------------------------------|---------------------|--------------------|-----------------|
+| Computer Use | [OSWorld](https://arxiv.org/abs/2404.07972) | 24.6 | 27.5 | **42.5** |
+| GUI Grounding | [ScreenSpotPro](https://arxiv.org/pdf/2504.07981v1) | 38.1 | 49.6 | **61.6** |
+
+