zhehuderek
/

qwen2_5_vl_3b_GEOQA_8K_hf

Image-Text-to-Text

text-generation-inference

Model card Files Files and versions Community

Model Card for Model ID

Base model: Qwen/Qwen2.5-VL-3B-Instruct
Training: GRPO with leonardPKU/GEOQA_8K_R1V
Training log on wandb: https://wandb.ai/ddderek-hk-polyu/easy_r1/runs/d1xtspm0
Total step of 70, not converged yet

Downloads last month: 7

Safetensors

Model size

3.75B params

Tensor type

BF16

·

Inference Providers NEW

Image-Text-to-Text

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Collection including zhehuderek/qwen2_5_vl_3b_GEOQA_8K_hf

vlm_grpo

VLM with GRPO training for enhanced reasoning • 2 items • Updated Apr 9