---
license: apache-2.0
base_model:
- DeepGlint-AI/MLCD-Embodied-7B
---
[](https://paperswithcode.com/sota/referring-expression-segmentation-on-refcocog?p=multi-label-cluster-discrimination-for-visual)
[](https://paperswithcode.com/sota/referring-expression-segmentation-on-refcoco-5?p=multi-label-cluster-discrimination-for-visual)
[](https://paperswithcode.com/sota/referring-expression-segmentation-on-refcoco-3?p=multi-label-cluster-discrimination-for-visual)
[](https://paperswithcode.com/sota/referring-expression-segmentation-on-refcocog-1?p=multi-label-cluster-discrimination-for-visual)
[](https://paperswithcode.com/sota/referring-expression-segmentation-on-refcoco-8?p=multi-label-cluster-discrimination-for-visual)
[](https://paperswithcode.com/sota/referring-expression-segmentation-on-refcoco-4?p=multi-label-cluster-discrimination-for-visual)
[](https://paperswithcode.com/sota/referring-expression-segmentation-on-refcoco-9?p=multi-label-cluster-discrimination-for-visual)
[](https://paperswithcode.com/sota/referring-expression-segmentation-on-refcoco?p=multi-label-cluster-discrimination-for-visual)
## RefCOCO Segmentation Evaluation
| Dataset | Split | MLCD-seg-7B | EVF-SAM | GLaMM | VisionLLM v2 | LISA |
| :-- | :-: | :-: | :-: | :-: | :-: | :-: |
| RefCOCO | val | **83.6** | 82.4 | 79.5 | 79.2 | 74.9 |
| RefCOCO | testA | **85.3** | 84.2 | 83.2 | 82.3 | 79.1 |
| RefCOCO | testB | **81.5** | 80.2 | 76.9 | 77.0 | 72.3 |
| RefCOCO+ | val | **79.4** | 76.5 | 72.6 | 68.9 | 65.1 |
| RefCOCO+ | testA | **82.9** | 80.0 | 78.7 | 75.8 | 70.8 |
| RefCOCO+ | testB | **75.6** | 71.9 | 64.6 | 61.8 | 58.1 |
| RefCOCOg | val | **79.7** | 78.2 | 74.2 | 73.3 | 67.9 |
| RefCOCOg | test | **80.5** | 78.3 | 74.9 | 74.8 | 70.6 |
## Evaluation
For basic inference, use the sample below:
```python
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model_path = "DeepGlint-AI/MLCD-Seg"  # or use your local path
# Load the model in half precision and move it to the GPU
mlcd_seg = AutoModel.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    trust_remote_code=True,
).cuda()
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)

# Assuming you have an image named test.jpg
seg_img = Image.open("test.jpg").convert("RGB")
seg_prompt = "Could you provide a segmentation mask for the right giraffe in this image?"
pred_mask = mlcd_seg.seg(seg_img, seg_prompt, tokenizer, force_seg=False)
```
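To inspect a prediction visually, a small helper along these lines may be useful. It is only a sketch: `overlay_mask` is a hypothetical helper, not part of MLCD-Seg, and it assumes the prediction has already been converted to a 2-D boolean array with the same height and width as the image. This card does not document the return type of `seg`, so that conversion step is left to the reader.

```python
import numpy as np
from PIL import Image


def overlay_mask(image, mask, color=(255, 0, 0), alpha=0.5):
    """Blend a solid color into the pixels of `image` selected by `mask`.

    `mask` is assumed to be a 2-D array of shape (height, width);
    nonzero entries mark the segmented region.
    """
    rgb = np.array(image.convert("RGB"), dtype=np.float32)  # writable copy
    mask = np.asarray(mask).astype(bool)
    if mask.shape != rgb.shape[:2]:
        raise ValueError(
            f"mask shape {mask.shape} does not match image shape {rgb.shape[:2]}"
        )
    tint = np.array(color, dtype=np.float32)
    # Alpha-blend the tint color into the masked pixels only
    rgb[mask] = (1.0 - alpha) * rgb[mask] + alpha * tint
    return Image.fromarray(rgb.round().astype(np.uint8))
```

The returned object is a regular `PIL.Image`, so it can be saved with `.save("overlay.png")` or displayed in a notebook.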
To evaluate on a benchmark dataset (e.g. RefCOCO), call `seg` with `force_seg=True` instead:
```python
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model_path = "DeepGlint-AI/MLCD-Seg"  # or use your local path
# Load the model in half precision and move it to the GPU
mlcd_seg = AutoModel.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    trust_remote_code=True,
).cuda()
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)

# Assuming you have an image named test.jpg
seg_img = Image.open("test.jpg").convert("RGB")
seg_prompt = "Could you provide a segmentation mask for the right giraffe in this image?"
pred_mask = mlcd_seg.seg(seg_img, seg_prompt, tokenizer, force_seg=True)
```
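Once predictions and ground-truth masks are available, they can be scored with an IoU-style metric. The sketch below is not the authors' evaluation script, and this card does not state which metric the table above reports. It only illustrates two metrics commonly used for referring segmentation: per-sample IoU and cumulative IoU (total intersection over total union). The function names are hypothetical, and masks are assumed to be binary arrays of matching shape.

```python
import numpy as np


def mask_iou(pred, target):
    """Intersection over union of two binary masks with the same shape."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    union = int(np.logical_or(pred, target).sum())
    if union == 0:
        return 1.0  # both masks empty: counted as a perfect match (a convention)
    return int(np.logical_and(pred, target).sum()) / union


def cumulative_iou(preds, targets):
    """Total intersection divided by total union over all mask pairs."""
    inter = 0
    union = 0
    for pred, target in zip(preds, targets):
        pred = np.asarray(pred).astype(bool)
        target = np.asarray(target).astype(bool)
        inter += int(np.logical_and(pred, target).sum())
        union += int(np.logical_or(pred, target).sum())
    return inter / union if union else 1.0
```

Cumulative IoU weights large objects more heavily than the mean of per-sample IoUs, so the two numbers can differ noticeably on the same predictions.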
## Example
<img src="https://github.com/user-attachments/assets/85c023a1-3e0c-4ea5-a764-1eb9ee0fbddf" alt="output" width="1024"/>
<img src="https://github.com/user-attachments/assets/5b767327-bd0a-4185-8f7e-b1ab0aa260c9" alt="output" width="1024"/>
## Citations
```
@misc{mlcdseg_wukun,
  author = {Wu, Kun and Xie, Yin and Zhou, Xinyu and An, Xiang and Deng, Jiankang and Jie, Yu},
  title = {MLCD-Seg},
  year = {2025},
  url = {https://github.com/deepglint/unicom/tree/main/downstream},
}
```