Submitted by MiniMax-AI 103 MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder · 20 authors 3
Submitted by ZacharyNovack 14 Fast Text-to-Audio Generation with Adversarial Post-Training · 11 authors 2
Submitted by akhaliq 10 AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale · 8 authors 2
Submitted by Junjie-Ye 8 A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models · 15 authors 2
Submitted by akhaliq 7 Aya Vision: Advancing the Frontier of Multilingual Multimodality · 25 authors 2
Submitted by jinghan23 7 Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging · 8 authors 2
Submitted by Omartificial-Intelligence-Space 5 Advancing Arabic Reverse Dictionary Systems: A Transformer-Based Approach with Dataset Construction Guidelines · 7 authors 2
Submitted by EdBianchi 4 SkillFormer: Unified Multi-View Video Understanding for Proficiency Estimation · 2 authors 2
Submitted by Omartificial-Intelligence-Space 2 Optimizing Retrieval-Augmented Generation: Analysis of Hyperparameter Impact on Performance and Efficiency · 4 authors 2
Submitted by taiwang 1 NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance · 9 authors 2
Submitted by trucnguyen28 1 ViMRHP: A Vietnamese Benchmark Dataset for Multimodal Review Helpfulness Prediction via Human-AI Collaborative Annotation · 4 authors 2
Submitted by onekq - Tests as Prompt: A Test-Driven-Development Benchmark for LLM Code Generation · 1 author 2