0-hero's Collections
R1-GRPO-Math-Python-Code-Experiments
Updated 2 days ago
LoRA and full fine-tuning experiments on R1 distills to generate Python code for math problems.
0-hero/r1-7B-grpo-v3.3-epoch-3 (updated Mar 28)
0-hero/r1-7B-grpo-v3.3-epoch-2 (updated Mar 28)
0-hero/r1-7B-grpo-v3.3-epoch-1 (updated Mar 28)
0-hero/r1-7B-grpo-v3.2-epoch-1 (updated Mar 27)
0-hero/r1-7B-grpo-v3.2-epoch-2 (updated Mar 27)
0-hero/r1-14B-grpo-v3.1-epoch-2 (updated Mar 26)
0-hero/r1-14B-grpo-v3.1-epoch-1 (updated Mar 26)
0-hero/r1-7B-grpo-v3.1-epoch-3 (updated Mar 24)
0-hero/r1-7B-grpo-v3.1-epoch-2 (updated Mar 24)
0-hero/r1-7B-grpo-v2-temp-1.0-60 (updated Mar 23)
0-hero/r1-14B-math-grpo-165 (updated Mar 12)
0-hero/r1-14B-math-grpo-80 (updated Mar 11)
0-hero/r1-7B-grpo-850 (updated Mar 10)
0-hero/r1-7B-grpo-710 (updated Mar 10)
0-hero/r1-7B-grpo-610 (updated Mar 10)
0-hero/r1-7B-grpo-80 (updated Mar 10)
0-hero/R1-7B-MATH-GRPO-FULL (updated Mar 9)
0-hero/R1-14B-GRPO (updated Mar 8)
0-hero/r1-7b-grpo-full (updated Mar 6)
0-hero/r1-8b-grpo-full (updated Mar 6)