Sherlock: Self-Correcting Reasoning in Vision-Language Models Paper • 2505.22651 • Published 6 days ago • 50
The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs Paper • 2504.17768 • Published Apr 24 • 13
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published Apr 17 • 92
SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models Paper • 2503.07605 • Published Mar 10 • 69
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13 • 194
Pippo: High-Resolution Multi-View Humans from a Single Image Paper • 2502.07785 • Published Feb 11 • 11
Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving Paper • 2502.07640 • Published Feb 11 • 8
Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents Paper • 2502.04223 • Published Feb 6 • 11
VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation Paper • 2502.07531 • Published Feb 11 • 13
Next Block Prediction: Video Generation via Semi-Autoregressive Modeling Paper • 2502.07737 • Published Feb 11 • 9
Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance Paper • 2502.06145 • Published Feb 10 • 16
Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey Paper • 2502.06872 • Published Feb 8 • 8
TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation Paper • 2502.07870 • Published Feb 11 • 45