How much VRAM is required? I have an 8 GB RTX 3060 and it seems insufficient
Hey all
I am running this model according to the instructions, but it keeps saying I am running out of VRAM despite the 8 GB on my RTX 3060.
That seems abnormal, as the model is roughly 6 GB.
Can anyone help with this? It looks like I don't have enough VRAM to run the AWQ 7B model.
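For reference, here is a quick way to see how much VRAM is actually free before loading anything (just a diagnostic sketch, assuming a single GPU at index 0):
import torch

# Diagnostic sketch: report free vs. total VRAM on the first CUDA device.
free_bytes, total_bytes = torch.cuda.mem_get_info(0)
print(f"free VRAM:  {free_bytes / 1024**3:.2f} GiB")
print(f"total VRAM: {total_bytes / 1024**3:.2f} GiB")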
Have you tried with the 3B model?
This is the 3B model repository here. I've tried on Colab with a T4 GPU and 16 GB of RAM. It fails too...
OutOfMemoryError: CUDA out of memory. Tried to allocate 12.20 GiB. GPU 0 has a total capacity of 14.74 GiB of which 6.24 GiB is free. Process 2446 has 8.49 GiB memory in use. Of the allocated memory 8.28 GiB is allocated by PyTorch, and 96.62 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
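The message itself suggests setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True. If you want to try that in a notebook, a minimal sketch is to set it before torch initializes CUDA (it only helps with fragmentation, not with a model that simply doesn't fit):
import os

# Must be set before the first CUDA allocation; expandable_segments can reduce fragmentation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch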
There is something wrong in the snippet, and I'm trying to find out what.
Try reducing the image size a bit.
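If I read the Qwen2.5-VL model card correctly, you can also cap the image token budget on the processor instead of resizing the image yourself; a sketch (the pixel values here are just an illustration):
from transformers import AutoProcessor

# Assumption: min_pixels / max_pixels are accepted by the Qwen2.5-VL processor, as shown in the model card.
min_pixels = 256 * 28 * 28    # lower bound on image tokens
max_pixels = 1024 * 28 * 28   # upper bound, keeps the vision tokens (and VRAM) in check
processor = AutoProcessor.from_pretrained(
    "Qwen/Qwen2.5-VL-3B-Instruct",
    min_pixels=min_pixels,
    max_pixels=max_pixels,
)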
OK... the problem with the given example snippet is that the image is HUGE.
This worked on Colab:
First, !pip install accelerate qwen_vl_utils
Then:
from PIL import Image
import requests

# Download the demo image, halve its resolution with reduce(2), and make sure it is RGB.
url = "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg"
image = Image.open(requests.get(url, stream=True).raw).reduce(2).convert('RGB')
image.save('demo.jpeg')
This saves the image at a smaller size, and I make sure the image is RGB.
Then:
from transformers import Qwen2_5_VLForConditionalGeneration, AutoTokenizer, AutoProcessor
from qwen_vl_utils import process_vision_info
import torch
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-3B-Instruct",
    torch_dtype=torch.float16,  # I use float16, not bfloat16, but maybe it works with the default
    device_map="auto",
)
# optional: torch.compile can speed up inference; you can skip this line
model = torch.compile(model)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-3B-Instruct")
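If you want to see how much of the VRAM is taken by the weights alone, you can print the model's memory footprint right after from_pretrained (a sketch; do it before torch.compile if you use it):
# Sketch: report how much memory the loaded weights occupy, in GiB.
print(f"model weights: {model.get_memory_footprint() / 1024**3:.2f} GiB")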
And finally:
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "file://demo.jpeg",  # <== the reduced image
            },
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
# Preparation for inference
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
).to(model.device)
# Inference: Generation of the output
generated_ids = model.generate(**inputs, max_new_tokens=128)
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text)
This gives:
The image depicts a serene beach scene with a person and a dog sitting on the sand. The person is wearing a plaid shirt and black pants, and they appear to be smiling or laughing. The dog, which looks like a Labrador Retriever, is also sitting on the sand and is wearing a harness. The dog is extending its paw towards the person, possibly in a gesture of greeting or playfulness. The background shows the ocean with gentle waves lapping at the shore, and the sky is clear with a soft light suggesting either early morning or late afternoon. The overall atmosphere of the image is peaceful and joyful.
Try reducing the image size a bit.
Are we synchronized? :-) I just gave the same answer :-p