SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published about 1 month ago • 180
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated 9 days ago • 463
Arbitrary-steps Image Super-resolution via Diffusion Inversion Paper • 2412.09013 • Published Dec 12, 2024 • 13
ColPali: Efficient Document Retrieval with Vision Language Models Paper • 2407.01449 • Published Jun 27, 2024 • 48
World Model on Million-Length Video And Language With RingAttention Paper • 2402.08268 • Published Feb 13, 2024 • 39
Open LLM Leaderboard 🏆 Track, rank and evaluate open LLMs and chatbots • Running on CPU Upgrade • 13k
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision Paper • 2312.09390 • Published Dec 14, 2023 • 33