johko's Collections

VLM Interleaved Images

updated Jul 12, 2024

  • LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models

    Paper • 2407.07895 • Published Jul 10, 2024 • 43

  • SEED-Story: Multimodal Long Story Generation with Large Language Model

    Paper • 2407.08683 • Published Jul 11, 2024 • 26

  • ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation

    Paper • 2407.06135 • Published Jul 8, 2024 • 23

  • InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

    Paper • 2407.03320 • Published Jul 3, 2024 • 96