DAMO-NLP-SG/VL3-SigLIP-NaViT
Language Technology Lab at Alibaba DAMO Academy
Image Feature Extraction
Transformers
Safetensors
English
videollama3_vision_encoder
feature-extraction
visual-encoder
multi-modal-large-language-model
custom_code
arxiv:2501.13106
arxiv:2406.07476
arxiv:2306.02858
License: apache-2.0
Branch: main
File: VL3-SigLIP-NaViT/README.md
Commit History
Update README.md · 557050f · verified · Cyril666 committed on Jan 24
Update README.md · 68771d7 · verified · Cyril666 committed on Jan 24
Update README.md · bd1afa4 · verified · Cyril666 committed on Jan 24
Update README.md · 6217c34 · verified · Cyril666 committed on Jan 24
Update README.md · 21cb29e · verified · Cyril666 committed on Jan 24
Update README.md · ed2c154 · verified · Cyril666 committed on Jan 24
Update README.md · 50d747a · verified · Cyril666 committed on Jan 24
Upload processor · 592e852 · verified · ClownRat committed on Jan 21