KABI

dongguanting

https://dongguanting.github.io/

AI & ML interests

Reasoning and Alignment for Large Language Models

Recent Activity

upvoted a paper about 19 hours ago

DeepCritic: Deliberate Critique with Large Language Models

authored a paper 3 days ago

WebThinker: Empowering Large Reasoning Models with Deep Research Capability

upvoted a paper 6 days ago

Phi-4-reasoning Technical Report

dongguanting's activity

upvoted a paper about 19 hours ago

DeepCritic: Deliberate Critique with Large Language Models

Paper • 2505.00662 • Published 6 days ago • 46

upvoted 2 papers 6 days ago

Phi-4-reasoning Technical Report

Paper • 2504.21318 • Published 8 days ago • 34

Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math

Paper • 2504.21233 • Published 8 days ago • 37

upvoted a paper 7 days ago

WebThinker: Empowering Large Reasoning Models with Deep Research Capability

Paper • 2504.21776 • Published 7 days ago • 41

upvoted a collection 17 days ago

Search-R1

Collection

Preliminary checkpoints with outcome-only RL. • 14 items • Updated about 1 month ago • 5

upvoted a paper about 1 month ago

Inference-Time Scaling for Generalist Reward Modeling

Paper • 2504.02495 • Published Apr 3 • 54

upvoted a collection about 1 month ago

BGE

Collection

23 items • Updated Feb 13 • 111

upvoted 2 papers about 2 months ago

Why Do Multi-Agent LLM Systems Fail?

Paper • 2503.13657 • Published Mar 17 • 45

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 111

upvoted 2 papers 3 months ago

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published Feb 20 • 103

RAG-Star: Enhancing Deliberative Reasoning with Retrieval Augmented Verification and Refinement

Paper • 2412.12881 • Published Dec 17, 2024 • 2

upvoted 9 papers 4 months ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 276

URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics

Paper • 2501.04686 • Published Jan 8 • 54

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published Jan 9 • 101

Virgo: A Preliminary Exploration on Reproducing o1-like MLLM

Paper • 2501.01904 • Published Jan 3 • 34