Submitted by xuchensong 38 Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning · 13 authors 1
Submitted by hongyuw 19 BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs · 3 authors 1
Submitted by YunxinLi 17 VideoVista-CulturalLingo: 360° Horizons-Bridging Cultures, Languages, and Domains in Video Comprehension · 7 authors 1
Submitted by HanleiZhang 9 Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark · 8 authors 1
Submitted by pnawrot 8 The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs · 6 authors 2
Submitted by carpedkm 7 Subject-driven Video Generation via Disentangled Identity and Motion · 7 authors 1
Submitted by amazingj 5 DianJin-R1: Evaluating and Enhancing Financial Reasoning in Large Language Models · 7 authors 1
Submitted by zaplm 5 DC-SAM: In-Context Segment Anything in Images and Videos via Dual Consistency · 7 authors 1
Submitted by alemiaschi - Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation · 9 authors