Commit 418fc43 (verified) · JjjFangg · 1 Parent(s): f13a85c

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +11 -1
README.md CHANGED
@@ -50,7 +50,7 @@ Application: https://github.com/bytedance/UI-TARS-desktop
  | Benchmark | UI-TARS-1.5 | OpenAI CUA | Claude 3.7 | Previous SOTA |
  |-----------|-------------|------------|------------|----------------|
  | [ScreensSpot-V2](https://arxiv.org/pdf/2410.23218) | **94.2** | 87.9 | 87.6 | 91.6 |
- | [ScreenSpotPro](https://github.com/lixaixin2000/ScreenSpot-Pro-GUI-Grounding?tab=readme-ov-file) | **61.6** | 23.4 | 27.7 | 43.6 |
+ | [ScreenSpotPro](https://arxiv.org/pdf/2504.07981v1) | **61.6** | 23.4 | 27.7 | 43.6 |
 
 
 
@@ -76,6 +76,16 @@ Application: https://github.com/bytedance/UI-TARS-desktop
  | | (chicken) | 0.1 | 0.0 | 0.4 | 0.5 | 0.6 |
  | | **100 Tasks Avg.** | 0.04 | 0.03 | 0.18 | 0.25 | 0.31 |
 
+ # Model Scale Comparison
+
+ This table compares performance across different model scales of UI-TARS on the OSworld benchmark.
+
+ | **Benchmark Type** | **Benchmark** | **UI-TARS-72B-DPO** | **UI-TARS-1.5-7B** | **UI-TARS-1.5** |
+ |--------------------|------------------------------------|---------------------|--------------------|-----------------|
+ | Computer Use | [OSWorld](https://arxiv.org/abs/2404.07972) | 24.6 | 27.5 | **42.5** |
+ | GUI Grounding | [ScreenSpotPro](https://arxiv.org/pdf/2504.07981v1) | 38.1 | 49.6 | **61.6** |
+
+
 
 
 