Does UI-TARS-7B-DPO have official multilingual benchmarks (en/ru/zh/etc)?

#8
by unstopppable - opened

The model bytedance-research/UI-TARS-7B-DPO seems to support English, Russian, and Chinese, with quality that varies by language in my informal testing. Is there any official data or benchmarks on its multilingual performance? Specifically:

Supported languages: Is there a full list?

Per-language metrics: Accuracy, fluency, or task-specific scores (e.g., MMLU, FLORES)?

DPO impact: Does preference optimization favor certain languages?

Any links to papers, GitHub docs, or internal stats would be appreciated!
