Orr Zohar PRO

orrzohar

https://orrzohar.github.io

AI & ML interests

Large Multi-Modal Models, Foundation Models, Video Understanding

orrzohar's activity

upvoted a paper about 21 hours ago

Transformers without Normalization

Paper • 2503.10622 • Published 3 days ago • 85

upvoted 4 papers 2 days ago

Long Context Tuning for Video Generation

Paper • 2503.10589 • Published 3 days ago • 13

VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search

Paper • 2503.10582 • Published 3 days ago • 16

Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models

Paper • 2503.09669 • Published 4 days ago • 31

CoSTA*: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing

Paper • 2503.10613 • Published 3 days ago • 61

upvoted a paper 3 days ago

VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary

Paper • 2503.09402 • Published 4 days ago • 6

upvoted a paper 4 days ago

Video Action Differencing

Paper • 2503.07860 • Published 6 days ago • 28

upvoted 3 papers 9 days ago

upvoted an article 12 days ago

A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality

Article • Published 13 days ago • 66

upvoted a collection 12 days ago

C4AI Aya Vision

Collection

Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated 12 days ago • 64

liked a dataset 13 days ago

BIOMEDICA/biomedica_webdataset_24M

Updated Jan 22 • 3.84k • 24

liked 2 models 13 days ago

BIOMEDICA/BMC_CLIP_CF

Updated Jan 17 • 8

HuggingFaceTB/SmolVLM2-2.2B-Instruct

Image-Text-to-Text • Updated 10 days ago • 505k • 112

updated a model 14 days ago

HuggingFaceTB/SmolVLM2-500M-Video-Instruct

Image-Text-to-Text • Updated 10 days ago • 7.55k • 42

New activity in HuggingFaceTB/SmolVLM2-500M-Video-Instruct 14 days ago

Update README.md

#15 opened 14 days ago by orrzohar

updated a model 14 days ago

HuggingFaceTB/SmolVLM2-256M-Video-Instruct

Image-Text-to-Text • Updated 10 days ago • 5.46k • 41

New activity in HuggingFaceTB/SmolVLM2-256M-Video-Instruct 14 days ago

Update README.md

#8 opened 14 days ago by orrzohar

New activity in HuggingFaceTB/SmolVLM2-2.2B-Instruct 14 days ago

Input Video length constraints

#6 opened 23 days ago by NikhilJoson