TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action Paper • 2505.01583 • Published 4 days ago • 7
Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 6 items • Updated 25 days ago • 66
Modifying Large Language Model Post-Training for Diverse Creative Writing Paper • 2503.17126 • Published Mar 21 • 36
MambaVision Collection MambaVision: A Hybrid Mamba-Transformer Vision Backbone. Includes both 1K and 21K pretrained models. • 13 items • Updated 1 day ago • 31
💫StarVector Models Collection StarVector is a multimodal LLM for Scalable Vector Graphics (SVG) generation, producing structured SVG code directly from images and text. • 2 items • Updated Mar 20 • 93
Ovis2 Collection Our latest advancement in multi-modal large language models (MLLMs) • 15 items • Updated Mar 25 • 60
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated 8 days ago • 461
AutoPresent: Designing Structured Visuals from Scratch Paper • 2501.00912 • Published Jan 1 • 8
SketchAgent: Language-Driven Sequential Sketch Generation Paper • 2411.17673 • Published Nov 26, 2024 • 19
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks Paper • 2412.14161 • Published Dec 18, 2024 • 52
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated 8 days ago • 310
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 1 day ago • 257
Pangea Collection A Fully Open Multilingual Multimodal LLM for 39 Languages • 26 items • Updated Feb 1 • 18
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 6 days ago • 303
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents Paper • 2407.16741 • Published Jul 23, 2024 • 72