Tianyi Zhou

zhoutianyi

https://tianyizhou.github.io/

AI & ML interests

ML, NLP, RL, Multi-modality

zhoutianyi's activity

upvoted a paper 6 days ago

Exploring Expert Failures Improves LLM Agent Tuning

Paper • 2504.13145 • Published 24 days ago • 12

upvoted a paper 11 days ago

GraphicBench: A Planning Benchmark for Graphic Design with Language Agents

Paper • 2504.11571 • Published 26 days ago • 1

authored a paper 19 days ago

WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents

Paper • 2504.15785 • Published 20 days ago • 19

upvoted a paper 19 days ago

WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents

Paper • 2504.15785 • Published 20 days ago • 19

commented twice on a paper 19 days ago

WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents

Paper • 2504.15785 • Published 20 days ago • 19

authored 2 papers 24 days ago

GraphicBench: A Planning Benchmark for Graphic Design with Language Agents

Paper • 2504.11571 • Published 26 days ago • 1

Exploring Expert Failures Improves LLM Agent Tuning

Paper • 2504.13145 • Published 24 days ago • 12

commented twice on a paper 24 days ago

Exploring Expert Failures Improves LLM Agent Tuning

Paper • 2504.13145 • Published 24 days ago • 12

liked a dataset 24 days ago

umd-zhou-lab/ColorBench

Viewer • Updated 23 days ago • 5.81k • 270 • 3

authored 3 papers 25 days ago

On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective

Paper • 2502.14296 • Published Feb 20 • 46

AutoBench-V: Can Large Vision-Language Models Benchmark Themselves?

Paper • 2410.21259 • Published Oct 28, 2024 • 1

ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness

Paper • 2504.10514 • Published Apr 10 • 45

upvoted a paper 25 days ago

ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness

Paper • 2504.10514 • Published Apr 10 • 45

commented three times on a paper 25 days ago

ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness

Paper • 2504.10514 • Published Apr 10 • 45

authored 2 papers 26 days ago

Efficient Reinforcement Finetuning via Adaptive Curriculum Learning

Paper • 2504.05520 • Published Apr 7 • 10

How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients

Paper • 2504.10766 • Published 27 days ago • 40