view article Article Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval Mar 22, 2024 • 88
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published 1 day ago • 61
Words or Vision: Do Vision-Language Models Have Blind Faith in Text? Paper • 2503.02199 • Published Mar 4 • 8
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 175
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation Paper • 2502.14846 • Published Feb 20 • 13
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16 • 156
view article Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control Feb 4 • 153
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training Paper • 2501.18511 • Published Jan 30 • 20
Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling Paper • 2501.16975 • Published Jan 28 • 31
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction Paper • 2501.06282 • Published Jan 10 • 52