Gemma 3 Collection A collection of lightweight, state-of-the-art open models built from the same research and technology that powers the Gemini 2.0 models • 31 items • Updated 4 days ago • 14
view article Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM 5 days ago • 269
Jamba 1.5 Collection The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction following foundation models • 2 items • Updated 10 days ago • 85
view article Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality 13 days ago • 66
C4AI Aya Vision Collection Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated 12 days ago • 64
Adding Conditional Control to Text-to-Image Diffusion Models Paper • 2302.05543 • Published Feb 10, 2023 • 49
AIMv2 Collection A collection of AIMv2 vision encoders that supports a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 74
view article Article PaliGemma 2 Mix - New Instruction Vision Language Models by Google 26 days ago • 65
view article Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 27 days ago • 93
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature Paper • 2501.07171 • Published Jan 13 • 50
Scaling Pre-training to One Hundred Billion Data for Vision Language Models Paper • 2502.07617 • Published Feb 11 • 29
Qwen2-VL Collection Vision-language model series based on Qwen2 • 16 items • Updated Dec 6, 2024 • 208