Long-Context Autoregressive Video Modeling with Next-Frame Prediction • Paper • 2503.19325 • Published 25 days ago • 71
CoMP: Continual Multimodal Pre-training for Vision Foundation Models • Paper • 2503.18931 • Published 26 days ago • 29