R1-GRPO-Math-Python-Code-Experiments

updated 2 days ago

LoRA and full fine-tune experiments on R1 distills to generate Python code for math problems.

  • 0-hero/r1-7B-grpo-v3.3-epoch-3

    Updated Mar 28

  • 0-hero/r1-7B-grpo-v3.3-epoch-2

    Updated Mar 28

  • 0-hero/r1-7B-grpo-v3.3-epoch-1

    Updated Mar 28

  • 0-hero/r1-7B-grpo-v3.2-epoch-1

    Updated Mar 27

  • 0-hero/r1-7B-grpo-v3.2-epoch-2

    Updated Mar 27

  • 0-hero/r1-14B-grpo-v3.1-epoch-2

    Updated Mar 26

  • 0-hero/r1-14B-grpo-v3.1-epoch-1

    Updated Mar 26

  • 0-hero/r1-7B-grpo-v3.1-epoch-3

    Updated Mar 24

  • 0-hero/r1-7B-grpo-v3.1-epoch-2

    Updated Mar 24

  • 0-hero/r1-7B-grpo-v2-temp-1.0-60

    Updated Mar 23

  • 0-hero/r1-14B-math-grpo-165

    Updated Mar 12

  • 0-hero/r1-14B-math-grpo-80

    Updated Mar 11

  • 0-hero/r1-7B-grpo-850

    Updated Mar 10

  • 0-hero/r1-7B-grpo-710

    Updated Mar 10

  • 0-hero/r1-7B-grpo-610

    Updated Mar 10

  • 0-hero/r1-7B-grpo-80

    Updated Mar 10

  • 0-hero/R1-7B-MATH-GRPO-FULL

    Updated Mar 9

  • 0-hero/R1-14B-GRPO

    Updated Mar 8

  • 0-hero/r1-7b-grpo-full

    Updated Mar 6

  • 0-hero/r1-8b-grpo-full

    Updated Mar 6