Xiangyan Liu

xyliu6

AI & ML interests

None yet

Organizations

None yet

xyliu6's activity

upvoted a paper about 6 hours ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published about 23 hours ago • 82

upvoted 2 papers 5 days ago

Sherlock: Self-Correcting Reasoning in Vision-Language Models

Paper • 2505.22651 • Published 6 days ago • 50

Fostering Video Reasoning via Next-Event Prediction

Paper • 2505.22457 • Published 6 days ago • 27

upvoted 2 papers 8 days ago

Lifelong Safety Alignment for Language Models

Paper • 2505.20259 • Published 8 days ago • 23

Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models

Paper • 2505.18536 • Published 10 days ago • 18

upvoted a paper 13 days ago

Optimizing Anytime Reasoning via Budget Relative Policy Optimization

Paper • 2505.13438 • Published 15 days ago • 35

authored 2 papers 14 days ago

Towards Robust Multi-Modal Reasoning via Model Selection

Paper • 2310.08446 • Published Oct 12, 2023

NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation

Paper • 2504.13055 • Published Apr 17 • 19

updated a model 15 days ago

xyliu6/NoisyRollout-Geo3K-32B

Updated 15 days ago • 5

updated a collection 15 days ago

NoisyRollout

Collection

8 items • Updated 15 days ago • 6

published a model 15 days ago

xyliu6/NoisyRollout-Geo3K-32B

Updated 15 days ago • 5

updated 2 models 15 days ago

xyliu6/NoisyRollout-MMK12-6.4K-32B

Updated 15 days ago • 6

xyliu6/NoisyRollout-MMK12-6.4K-7B

Updated 15 days ago • 5

published 2 models 15 days ago

xyliu6/NoisyRollout-MMK12-6.4K-7B

Updated 15 days ago • 5

xyliu6/NoisyRollout-MMK12-6.4K-32B

Updated 15 days ago • 6

upvoted 2 papers 21 days ago

DanceGRPO: Unleashing GRPO on Visual Generation

Paper • 2505.07818 • Published 22 days ago • 29

REFINE-AF: A Task-Agnostic Framework to Align Language Models via Self-Generated Instructions using Reinforcement Learning from Automated Feedback

Paper • 2505.06548 • Published 24 days ago • 30

updated a dataset about 1 month ago

xyliu6/k12-freeform-extended

Viewer • Updated Apr 27 • 15.4k • 45