fangwu97/Qwen2.5-0.5B-Instruct-GRPO-test
Transformers
Safetensors
BytedTsinghua-SIA/DAPO-Math-17k
Generated from Trainer
trl
grpo
arxiv:2402.03300
main
Qwen2.5-0.5B-Instruct-GRPO-test
Commit History
Training in progress, step 10
c367bf0
verified
fangwu97
committed 29 days ago
End of training
bbfc410
verified
fangwu97
committed 29 days ago
Training in progress, step 10
db341c8
verified
fangwu97
committed 29 days ago
initial commit
a2d2bbc
verified
fangwu97
committed 30 days ago