Dremin's Collections
VLM


updated Sep 9, 2024

  • ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

    Paper • 2406.04325 • Published Jun 6, 2024 • 76

  • MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

    Paper • 2401.15947 • Published Jan 29, 2024 • 53

  • Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

    Paper • 2311.10122 • Published Nov 16, 2023 • 27

  • Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models

    Paper • 2311.16103 • Published Nov 27, 2023 • 1

  • LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

    Paper • 2310.01852 • Published Oct 3, 2023 • 2