Boosting Generative Image Modeling via Joint Image-Feature Synthesis Paper • 2504.16064 • Published Apr 22 • 14
LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models Paper • 2504.14032 • Published Apr 18 • 4
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning Paper • 2504.17192 • Published Apr 24 • 110
MonoPlace3D: Learning 3D-Aware Object Placement for 3D Monocular Detection Paper • 2504.06801 • Published Apr 9 • 5
Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction Paper • 2504.07961 • Published Apr 10 • 6
Tokenize Image Patches: Global Context Fusion for Effective Haze Removal in Large Images Paper • 2504.09621 • Published Apr 13 • 12
HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation Paper • 2504.13072 • Published Apr 17 • 13
DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging Paper • 2504.12364 • Published Apr 16 • 21
InteractVLM: 3D Interaction Reasoning from 2D Foundational Models Paper • 2504.05303 • Published Apr 7 • 5
FlexIP: Dynamic Control of Preservation and Personality for Customized Image Generation Paper • 2504.07405 • Published Apr 10 • 12
Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images Paper • 2504.08727 • Published Apr 11 • 11
BlockGaussian: Efficient Large-Scale Scene Novel View Synthesis via Adaptive Block-Based Gaussian Splatting Paper • 2504.09048 • Published Apr 12 • 8
REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers Paper • 2504.10483 • Published Apr 14 • 21
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding Paper • 2504.13180 • Published Apr 17 • 17
SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning Paper • 2505.12448 • Published 14 days ago • 10