Tianyu Pang

P2333

https://p2333.github.io/

AI & ML interests

Machine Learning

Recent Activity

upvoted a paper 5 days ago

Fostering Video Reasoning via Next-Event Prediction

commented on a paper 5 days ago

Fostering Video Reasoning via Next-Event Prediction

upvoted a paper 6 days ago

Reinforcing General Reasoning without Verifiers

Organizations

None yet

P2333's activity

upvoted a paper 5 days ago

Fostering Video Reasoning via Next-Event Prediction

Paper • 2505.22457 • Published 6 days ago • 27

upvoted 2 papers 6 days ago

Reinforcing General Reasoning without Verifiers

Paper • 2505.21493 • Published 7 days ago • 26

Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment

Paper • 2505.21494 • Published 7 days ago • 8

upvoted a paper 8 days ago

Lifelong Safety Alignment for Language Models

Paper • 2505.20259 • Published 8 days ago • 23

upvoted a paper 10 days ago

BanditSpec: Adaptive Speculative Decoding via Bandit Algorithms

Paper • 2505.15141 • Published 13 days ago • 4

upvoted a paper 11 days ago

QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design

Paper • 2505.16175 • Published 13 days ago • 39

upvoted a paper 13 days ago

Optimizing Anytime Reasoning via Budget Relative Policy Optimization

Paper • 2505.13438 • Published 15 days ago • 35

upvoted a collection about 2 months ago

🚀 Active PRM

Efficient Process Reward Model Training via Active Learning. • 4 items • Updated Apr 16 • 3

upvoted 2 papers about 2 months ago

Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models

Paper • 2412.18605 • Published Dec 24, 2024 • 21

NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation

Paper • 2504.13055 • Published Apr 17 • 19

upvoted a collection about 2 months ago

NoisyRollout

8 items • Updated 15 days ago • 6

upvoted a paper about 2 months ago

Efficient Process Reward Model Training via Active Learning

Paper • 2504.10559 • Published Apr 14 • 13

upvoted a paper 2 months ago

Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published Mar 26 • 49

upvoted a collection 2 months ago

🌾Oat-Zero: Understanding R1-Zero-Like Training

5 items • Updated Apr 10 • 7

upvoted 4 collections 6 months ago

⚓️ Sailor Language Models

Sailor: Open Language Models tailored for South-East Asia (SEA) released by Sea AI Lab. • 17 items • Updated Dec 3, 2024 • 17

📈 Scaling Laws with Vocabulary

Increase your vocabulary size when you scale up your language model • 5 items • Updated Aug 11, 2024 • 6

🧬 RegMix: Data Mixture as Regression

Automatic data mixture method for large language model pre-training • 10 items • Updated Jul 26, 2024 • 8

🔱 Sailor2 Language Models

Sailing in South-East Asia with Inclusive Multilingual LLMs • 34 items • Updated Feb 24 • 28

upvoted a paper 8 months ago

Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates

Paper • 2410.07137 • Published Oct 9, 2024 • 7

upvoted a collection 11 months ago

💡 DICE

Self-alignment with DPO Implicit Rewards • 5 items • Updated Jul 28, 2024 • 9