Submitted by roadjiang 100 Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model · 54 authors 10
Submitted by BestWishYsh 37 MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft · 7 authors 3
Submitted by YuuTennYi 36 GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation · 5 authors 2
Submitted by ZhuangXialie 20 SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning · 6 authors 2
Submitted by tianchez 19 VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model · 12 authors 2
Submitted by yeates 14 ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration · 10 authors 2
Submitted by BestWishYsh 8 FlexIP: Dynamic Control of Preservation and Personality for Customized Image Generation · 4 authors 2
Submitted by akhaliq 7 Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images · 7 authors 2
Submitted by stefan-it 7 ModernBERT or DeBERTaV3? Examining Architecture and Data Influence on Transformer Encoder Models Performance · 3 authors 3
Submitted by AdinaY 7 Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs · 52 authors 3
Submitted by DannyLan 7 Do PhD-level LLMs Truly Grasp Elementary Addition? Probing Rule Learning vs. Memorization in Large Language Models · 4 authors 4
Submitted by sauradip 5 In-2-4D: Inbetweening from Two Single-View Images to 4D Generation · 4 authors 2
Submitted by jialuliluka 4 Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization · 6 authors 2
Submitted by richard-guyunqi 4 BlenderGym: Benchmarking Foundational Model Systems for Graphics Editing · 5 authors 2
Submitted by nielsr 3 UKBOB: One Billion MRI Labeled Masks for Generalizable 3D Medical Image Segmentation · 3 authors 2
Submitted by gabrielelozupone98 2 Latent Diffusion Autoencoders: Toward Efficient and Meaningful Unsupervised Representation Learning in Medical Imaging · 6 authors 2
Submitted by aashiqmuhamed 2 SAEs Can Improve Unlearning: Dynamic Sparse Autoencoder Guardrails for Precision Unlearning in LLMs · 4 authors 2
Submitted by ruipeterpan 1 SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning · 6 authors 2
Submitted by saidwivedi - InteractVLM: 3D Interaction Reasoning from 2D Foundational Models · 7 authors 2