SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning Paper • 2505.12448 • Published 16 days ago • 10
Unicorn: Text-Only Data Synthesis for Vision Language Model Training Paper • 2503.22655 • Published Mar 28 • 39
BELLE-2/Belle-whisper-large-v3-turbo-zh Automatic Speech Recognition • Updated Dec 16, 2024 • 1.3k • 57
Running • 38 • YOLOv10 Document Layout Analysis • Analyze scanned documents to detect and label content