---
base_model: Qwen/Qwen2-VL-7B-Instruct
language:
- en
library_name: peft
license: mit
tags:
- LLM
- VLM
- Embedding
- Multimodal
pipeline_tag: image-text-to-text
---

## Model Details

Instruction-finetuned adapter for ABC: Achieving Better Control of Multimodal Embeddings using VLMs.

### Model Sources

This adapter is trained on top of Qwen2-VL-7B-Instruct (`Qwen/Qwen2-VL-7B-Instruct`).

### Paper and Website

For more information, please refer to the [paper](https://arxiv.org/abs/2503.00329) and the [website](https://tiger-ai-lab.github.io/ABC/).

## Citation

```
@misc{schneider2025abcachievingbettercontrol,
      title={ABC: Achieving Better Control of Multimodal Embeddings using VLMs},
      author={Benjamin Schneider and Florian Kerschbaum and Wenhu Chen},
      year={2025},
      eprint={2503.00329},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2503.00329},
}
```
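
## Usage

Since this card describes a PEFT adapter on top of `Qwen/Qwen2-VL-7B-Instruct`, loading it might look roughly like the sketch below. This is an illustration under assumptions, not the authors' documented procedure: the helper name `load_abc_adapter` is hypothetical, the adapter repository ID is not stated in this card and must be supplied by the caller, and a `transformers` release that includes `Qwen2VLForConditionalGeneration` plus the `peft` package are assumed to be installed.

```python
def load_abc_adapter(adapter_id, base_id="Qwen/Qwen2-VL-7B-Instruct"):
    """Load the base VLM and attach a PEFT adapter to it (hypothetical helper)."""
    # Imported lazily so that defining this function does not require
    # the heavy dependencies to be installed.
    from peft import PeftModel
    from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

    # The processor handles both image preprocessing and text tokenization.
    processor = AutoProcessor.from_pretrained(base_id)

    # Load the base model named in this card's `base_model` field.
    base = Qwen2VLForConditionalGeneration.from_pretrained(
        base_id, torch_dtype="auto"
    )

    # Attach the adapter weights; `adapter_id` is a placeholder for the
    # adapter's Hub repository ID or a local path (not given in this card).
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return model, processor
```

The sketch only covers loading. How embeddings are extracted from the model is not described in this card, so consult the project website for that step.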