shubhamprshr/Llama-3.2-3B-Instruct_blocksworld1246_sgrpo_cosine_0.5_0.5_True_1200 • Text Generation • Updated 2 days ago • 4
shubhamprshr/Qwen2.5-3B-Instruct_math_sgrpo_cosine_0.5_0.5_True_1200 • Text Generation • Updated 3 days ago
shubhamprshr/Qwen2.5-3B-Instruct_math_sgrpo_gaussian_0.25_0.75_True_1200 • Text Generation • Updated 3 days ago
shubhamprshr/Qwen2.5-3B-Instruct_math_sgrpo_classic_0.5_0.5_True_1200 • Text Generation • Updated 5 days ago
shubhamprshr/Qwen2.5-3B-Instruct_math_sgrpo_gaussian_0.5_0.5_True_1200 • Text Generation • Updated 5 days ago
shubhamprshr/Qwen2.5-3B-Instruct_blocksworld8_sgrpo_balanced_0.5_0.5_True_1200 • Text Generation • Updated 6 days ago • 1
shubhamprshr/Qwen2.5-3B-Instruct_math_sgrpo_balanced_0.5_0.5_True_1200 • Text Generation • Updated 6 days ago • 1
shubhamprshr/Qwen2.5-3B-Instruct_blocksworld6_sgrpo_balanced_0.5_0.5_True_1200 • Text Generation • Updated 7 days ago • 1
shubhamprshr/Qwen2.5-1.5B-Instruct_blocksworld246_sgrpo_balanced_0.5_0.5_True_1200 • Text Generation • Updated 7 days ago