Dongfu Jiang

DongfuJiang

https://jdf-prog.github.io/

AI & ML interests

Large language models, modality reasoning, and their evaluation

DongfuJiang's activity

liked a dataset 21 minutes ago

RUC-AIBOX/STILL-3-TOOL-32B-Data

Viewer • Updated Feb 28 • 820 • 124 • 3

published a dataset about 11 hours ago

VerlTool/ToRL-Math

Viewer • Updated 2 days ago • 29.1k • 1

updated a model about 19 hours ago

VerlTool/acecoder-qwen2.5-coder-1.5b-grpo-n16-b128-t1.0-lr1e-6-5-turns-force-reflect-410-step

Updated about 19 hours ago

published a model about 19 hours ago

VerlTool/acecoder-qwen2.5-coder-1.5b-grpo-n16-b128-t1.0-lr1e-6-5-turns-force-reflect-410-step

Updated about 19 hours ago

updated a model 1 day ago

VerlTool/torl-fsdp_agent-qwen_qwen2.5-math-1.5b-grpo-n16-b128-t1.0-lr1e-6torl_same_train-310-step

Updated 1 day ago

published a model 1 day ago

VerlTool/torl-fsdp_agent-qwen_qwen2.5-math-1.5b-grpo-n16-b128-t1.0-lr1e-6torl_same_train-310-step

Updated 1 day ago

updated 2 models 1 day ago

VerlTool/torl-fsdp_agent-qwen_qwen2.5-math-1.5b-grpo-n16-b128-t1.0-lr1e-6new-v2-430-step

Updated 1 day ago • 7

VerlTool/torl-fsdp_agent-qwen_qwen2.5-math-7b-grpo-n16-b128-t1.0-lr1e-6new-v2-430-step

Updated 1 day ago • 6

published a model 1 day ago

VerlTool/torl-fsdp_agent-qwen_qwen2.5-math-7b-grpo-n16-b128-t1.0-lr1e-6new-v2-430-step

Updated 1 day ago • 6

updated a model 2 days ago

VerlTool/acecoder-fsdp_agent-qwen_qwen2.5-coder-1.5b-grpo-n16-b128-t1.0-lr1e-6new-580-step

Updated 2 days ago • 2

upvoted a paper 2 days ago

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Paper • 2504.20571 • Published 3 days ago • 71

updated a dataset 2 days ago

VerlTool/ToRL-Math

Viewer • Updated 2 days ago • 29.1k • 1

updated 2 models 2 days ago

VerlTool/torl-fsdp_agent-qwen_qwen2.5-7b-grpo-n16-b128-t1.0-lr1e-6new-190-step

Updated 2 days ago • 7

VerlTool/torl-fsdp_agent-qwen_qwen2.5-math-7b-grpo-n16-b128-t1.0-lr1e-6new-220-step

Updated 2 days ago • 5

published 3 models 2 days ago

VerlTool/torl-fsdp_agent-qwen_qwen2.5-7b-grpo-n16-b128-t1.0-lr1e-6new-190-step

Updated 2 days ago • 7

VerlTool/torl-fsdp_agent-qwen_qwen2.5-math-7b-grpo-n16-b128-t1.0-lr1e-6new-220-step

Updated 2 days ago • 5

VerlTool/torl-fsdp_agent-qwen_qwen2.5-math-1.5b-grpo-n16-b128-t1.0-lr1e-6new-v2-430-step

Updated 1 day ago • 7

published a model 3 days ago

VerlTool/acecoder-fsdp_agent-qwen_qwen2.5-coder-1.5b-grpo-n16-b128-t1.0-lr1e-6new-580-step

Updated 2 days ago • 2

updated a model 3 days ago

VerlTool/acecoder-fsdp_agent-qwen_qwen2.5-coder-7b-grpo-n16-b128-t1.0-lr1e-6new-210-step

Updated 3 days ago

published a model 3 days ago

VerlTool/acecoder-fsdp_agent-qwen_qwen2.5-coder-7b-grpo-n16-b128-t1.0-lr1e-6new-210-step

Updated 3 days ago