Submitted by cg1177 · 29 · Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models · 21 authors · 2
Submitted by mpark · 20 · SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation · 5 authors · 1
Submitted by wchengad · 18 · StyleMe3D: Stylization with Disentangled Priors by Multiple Encoders on 3D Gaussians · 10 authors · 1
Submitted by salmannyu · 18 · X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents · 10 authors · 1
Submitted by Swtheking · 10 · LeetCodeDataset: A Temporal Dataset for Robust Evaluation and Efficient Training of Code LLMs · 8 authors · 1
Submitted by frog123123123123 · 9 · Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs · 10 authors · 1
Submitted by pengxiang · 8 · InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners · 8 authors · 1
Submitted by Ningyu · 6 · EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models · 10 authors · 1
Submitted by ewrfcas · 6 · Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation · 8 authors · 1
Submitted by Yuxiang007 · 6 · LearnAct: Few-Shot Mobile GUI Agent with a Unified Demonstration Benchmark · 9 authors · 1
Submitted by manuelkansy · 6 · LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping · 5 authors · 5
Submitted by bys0318 · 4 · An LMM for Efficient Video Understanding via Reinforced Compression of Video Cubes · 7 authors · 1
Submitted by quyanh · 3 · RainbowPlus: Enhancing Adversarial Prompt Generation via Evolutionary Quality-Diversity Search · 3 authors · 4
Submitted by SieraL · 3 · NEMOTRON-CROSSTHINK: Scaling Self-Learning beyond Math Reasoning · 11 authors · 3