VLM with GRPO training for enhanced reasoning
Derek Zhe Hu
zhehuderek
AI & ML interests
NLP, Multimodality
Recent Activity
Updated a model 1 day ago: zhehuderek/qwen2_5_vl_7b_sft_decisionmaking_gpt4reason
Published a model 1 day ago: zhehuderek/qwen2_5_vl_7b_sft_decisionmaking_gpt4reason
Updated a model 1 day ago: zhehuderek/qwen2_5_vl_3b_sft_decisionmaking_gpt4reason
Organizations
None yet
Collections
2
The collections of visual humor understanding and comparative reasoning.
- zhehuderek/YESBUT_Benchmark
  Viewer • Updated • 348 • 22 • 1
- zhehuderek/YESBUT_Benchmark_V2
  Viewer • Updated • 1.26k • 18 • 1
- Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions
  Paper • 2405.19088 • Published
- When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning?
  Paper • 2503.23137 • Published
Models
6
zhehuderek/qwen2_5_vl_7b_sft_decisionmaking_gpt4reason • Image-Text-to-Text • Updated
zhehuderek/qwen2_5_vl_3b_sft_decisionmaking_gpt4reason • Image-Text-to-Text • Updated
zhehuderek/qwen2_5_vl_3b_grpo_decisionmaking_scratch_run3_85 • Image-Text-to-Text • Updated • 7
zhehuderek/qwen2_5_vl_3b_GEOQA_8K_hf • Image-Text-to-Text • Updated • 7
zhehuderek/llama-2-7b-chinese • Text Generation • Updated • 2
zhehuderek/llama-3.1-8b-chinese-sft • Text Generation • Updated • 1